Abstract

Quantum de Finetti theorems are a useful tool in the study of correlations in quantum mul- 
tipartite states. In this paper we prove two new quantum de Finetti theorems, both showing 
that under tests formed by local measurements in each of the subsystems one can get a much 
improved error dependence on the dimension of the subsystems. We also obtain similar results 
for non-signaling probability distributions. We give the following applications of the results to
quantum complexity theory, polynomial optimization, and quantum information theory:

We prove the optimality of the Chen-Drucker protocol for 3-SAT, under the assumption that
there is no subexponential-time algorithm for SAT. In the protocol a prover sends to a
verifier $\sqrt{n}\,\mathrm{polylog}(n)$ unentangled quantum states, each composed of $O(\log(n))$ qubits,
as a proof of the satisfiability of a 3-SAT instance with $n$ variables and $O(n)$ clauses. The
quantum verifier checks the validity of the proof by performing local measurements on each
of the proofs and classically processing the outcomes. We show that any similar protocol
with $O(n^{1/2-\varepsilon})$ qubits would imply an $\exp(n^{1-2\varepsilon}\,\mathrm{polylog}(n))$-time algorithm for 3-SAT.

We show that the maximum winning probability of free games (in which the questions to
each prover are chosen independently) can be estimated by linear programming in time
$\exp(O(\log|Q| + \log^2|A|/\varepsilon^2))$, with $|Q|$ and $|A|$ the question and answer alphabet sizes,
respectively, matching the performance of a previously known algorithm due to Aaronson,
Impagliazzo, Moshkovitz, and Shor. This result follows from a new monogamy relation for
non-locality, showing that $k$-extendible non-signaling distributions give at most an $O(k^{-1/2})$
advantage over classical strategies for free games. We also show that 3-SAT with $n$ variables
can be reduced to obtaining a constant error approximation of the maximum winning
probability under entangled strategies of $O(\sqrt{n})$-player one-round non-local games, in which
only two players are selected to send $O(\sqrt{n})$-bit messages.

We show that the optimization of certain polynomials over the complex hypersphere can be
performed in quasipolynomial time in the number of variables $n$ by considering $O(\log(n))$
rounds of the Sum-of-Squares (Parrilo/Lasserre) hierarchy of semidefinite programs. This
can be considered an analogue for the hypersphere of similar known results for the
simplex. As an application to entanglement theory, we find a quasipolynomial-time algorithm
for deciding multipartite separability.

We consider a quantum tomography result due to Aaronson, showing that given an unknown
$n$-qubit state one can perform tomography that works well for most observables by
measuring only $O(n)$ independent and identically distributed (i.i.d.) copies of the state,
and relax the assumption of having i.i.d. copies of the state to merely the ability to select
subsystems at random from a quantum multipartite state.

The proofs of the new quantum de Finetti theorems are based on information theory, in
particular on the chain rule of mutual information.

1 Introduction 

A central problem in quantum information theory, quantum computation, and physics in general 
is to understand entanglement, quantum correlations with no counterpart in classical probability 
theory. An important technique in the study of entanglement is the use of quantum versions of the de
Finetti theorem. The latter states that the marginal probability distribution $p_{X_1 \cdots X_l}$ on $l$
subsystems of a permutation-symmetric probability distribution $p_{X_1 \cdots X_k}$ on $k \geq l$ subsystems
is close (within $l(l-1)/k$ in variational distance) to a convex combination of independent and
identically distributed (i.i.d.) probability distributions [39]. This is a powerful result as it allows
us to infer a very particular form for $p_{X_1 \cdots X_l}$ merely based on a symmetry assumption on
$p_{X_1 \cdots X_k}$. Note that we can always make sure this assumption holds true by merely forgetting the
order of the $k$ subsystems. Quantum versions of the de Finetti theorem state that an $l$-partite
quantum state $\rho^{A_1 \cdots A_l}$ that is a reduced state of a permutation-symmetric state on $k \geq l$
subsystems is close (for $k \gg l$) to a convex combination of i.i.d. quantum states, i.e.
$\rho^{A_1 \cdots A_l} \approx \int \mu(d\sigma)\, \sigma^{\otimes l}$ for a probability measure $\mu$
on quantum states.
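
As a purely classical illustration of the statement above, the following Python sketch compares drawing $l$ balls without replacement from an urn of $k$ balls (the $l$-subsystem marginal of a permutation-symmetric distribution on $k$ subsystems) with $l$ i.i.d. draws, and checks the distance against $l(l-1)/k$. The urn model, the function name `tv_urn_vs_iid`, and the parameter values are assumptions made for illustration only; comparing with a single i.i.d. distribution merely upper-bounds the distance to the best convex combination.

```python
from math import comb


def tv_urn_vs_iid(k: int, r: int, l: int) -> float:
    """Total variation distance between l draws without replacement from an
    urn with k balls (r of them red) and l i.i.d. draws with P(red) = r / k.

    Both laws are exchangeable, so comparing the number of red balls drawn
    gives the same distance as comparing full sequences.
    """
    p = r / k
    total = 0.0
    for j in range(l + 1):
        hypergeometric = comb(r, j) * comb(k - r, l - j) / comb(k, l)
        binomial = comb(l, j) * p**j * (1 - p) ** (l - j)
        total += abs(hypergeometric - binomial)
    return total / 2


# Illustrative parameters (assumed): the distance stays below l(l-1)/k.
for l in (2, 5, 10):
    print(l, tv_urn_vs_iid(100, 50, l), l * (l - 1) / 100)
```

For fixed $l$ the computed distance shrinks as $k$ grows, which is the qualitative content of the classical theorem.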

The quantum version appears very similar to the original de Finetti theorem, but it is much 
more remarkable. Not only does it say that the correlations are arranged in an organized fashion (as a
convex combination of i.i.d. states), but also that the state of $l$ subsystems is close to a separable,
non-entangled, state. A well-known property of entanglement is that it is monogamous: A quantum
system cannot be very much entangled with a large number of other systems. The quantum 
de Finetti theorems provide a quantitative statement for the monogamy of entanglement; in a 
symmetric state all the subsystems are equally correlated with all the others and so each of them 
can only be slightly entangled with a few of the others. 

We now know several possible quantum versions of the de Finetti theorem [47, 82, 44, 77, 86,
27, 62, 32, 80, 70, 22]. A natural way to quantify the closeness to convex combinations of i.i.d.
states is by the trace norm¹. In this case Christandl, König, Mitchison, and Renner [32] proved
an almost optimal quantum de Finetti theorem: $\rho^{A_1 \cdots A_l}$ is $(2d^2 l/k)$-close to a convex combination
of i.i.d. states in trace norm, with $d$ the dimension of the subsystems, while there are examples
where the error is $\Omega(dl/k)$. However, in many applications this error is too large to be useful. One
possible way forward is therefore to consider other ways of quantifying the approximation rather
than the trace norm.

There are two known quantum de Finetti theorems following this idea. The first is the exponential
de Finetti theorem of Renner [80], which achieves an exponentially small error in $k - l$, but
only shows that $\rho^{A_1 \cdots A_l}$ is close to a convex combination of "almost i.i.d." states, a generalization
of i.i.d. states having similar properties with respect to certain statistical tests. The second is the
de Finetti theorem proved in Ref. [24], which works for $l = 2$ and has an error of $\sqrt{16 \ln(d)/k}$,
an exponential improvement on the dimension dependence. The approximation is quantified by
the one-way LOCC² norm, a variant of the trace norm for bipartite systems in which only measurements
implementable by local operations and one-directional classical communication are
allowed. Both results have found interesting applications: The first to quantum key distribution
[79], quantum hypothesis testing [23], and quantum state tomography [80]; the second to entanglement
testing, where it gives a quasipolynomial-time algorithm for determining if a quantum
state is entangled or not [22], and to quantum complexity theory [22]. These two results suggest
that more quantum versions of the de Finetti theorem might exist. In this paper we show that this
is indeed the case.

1 The trace norm gives the maximum probability of distinguishing two quantum states by arbitrary measurements.

It has emerged that some of the properties of entanglement, such as its monogamous character,
are shared by more general classes of correlations [67]. A particularly interesting example is the
class of non-signaling distributions, which are a generalization of the correlations attainable by
quantum mechanics. Versions of the de Finetti theorem for non-signaling distributions have also
been derived [33, 12], although here again the scaling of the error (linear in the number of possible
measurements) has limited the applicability of the results.

Another way to study quantum entanglement is via its role in operational tasks, e.g. in quan- 
tum key distribution and quantum computation. One fascinating case is the role of entanglement 
in quantum proof systems. The goal there is to understand how useful entangled states are for
convincing a verifier of the truth of a mathematical statement. There are many settings, such as
interactive or non-interactive protocols, one or multiple provers, and which type of communi- 
cation is allowed among the provers and the verifier (see e.g. IMlP . In this paper we will be 
concerned with two such settings in particular. The first is MIP*, in which the provers share
entanglement (or even general non-signaling correlations) and are only allowed to communicate with
the verifier and not with each other [60]. The second is QMA($k$), meaning non-interactive multiple
proof protocols with the assumption that the proofs are not entangled [61]. Here we have
the interesting situation where the assumption of not having entanglement among the proofs 
appears to give extra power to the proof system. Both settings have been extensively studied
in the past (see e.g. J35l [85l ED M [53 [53l [5H [3H EH EH for work on MIP*/QMIP and
13[l3[!6l[23[20l[2![Il[6^ for work on QMA($k$)), although there are still many interesting open
questions concerning them.

2 Results 

The main results of this paper are two new quantum versions of the de Finetti theorem, along with
extensions to arbitrary non-signaling distributions. Both are based on a coarser notion of approx- 
imation to the target state than the trace norm, but as a pay-off their error scales exponentially 
better with dimension. The notion of approximation used is that two quantum states are close if 
they have the same statistics under any local measurements on the subsystems. Our results thus 
extend the de Finetti bound of Ref. [22] to an arbitrary number of subsystems while improving on
the error term, generalizing it to general non-signaling distributions, and in some cases providing 
an explicit rounding scheme. Among the applications of the new quantum de Finetti theorems we 
address two problems in quantum complexity theory, each concerning one of the proof systems 
mentioned above. Below we give a brief description of these applications. 

2 The name LOCC stands for local operations and classical communication. See Eq. {91} for a precise definition of 
one-way LOCC. 
Multiple Unentangled Proofs: The first application concerns a protocol due to Chen and Drucker
[29] in which a prover sends to a verifier $\sqrt{n}\,\mathrm{polylog}(n)$ unentangled quantum states, each
composed of $O(\log(n))$ qubits, as a proof of the satisfiability of a 3-SAT instance with $n$ variables
and $O(n)$ clauses. The quantum verifier then checks the validity of the proof by performing local
quantum measurements on each of the proofs and post-processing the outcomes. This result (building
on [2]) is surprising since one can convince a verifier of the satisfiability of a 3-SAT instance by
sending only $\sqrt{n}\,\mathrm{polylog}(n)$ qubits! It is a natural question whether the total number of qubits could
be decreased even further. As a direct application of one of the new quantum de Finetti theorems
we give strong evidence against any further decrease: We show that any similar protocol
with $O(n^{1/2-\varepsilon})$ qubits, for any $\varepsilon > 0$, would imply an $\exp(n^{1-2\varepsilon}\,\mathrm{polylog}(n))$-time algorithm for
3-SAT. This proves the optimality of the protocol under the plausible assumption that there are no
subexponential-time algorithms for SAT [49].

A related, but harder, problem is whether QMA(2) protocols can give at most a quadratic
reduction in proof size with respect to QMA³,⁴. We believe the result we obtain gives evidence that
this might be the case and that a suitable quantum version of the de Finetti theorem might be the
right tool to show it⁵.

Non-local Games: The second application concerns the computational complexity of non-local 
games. We give two results in this direction. The first is algorithmic and concerns the class of 
free games, defined as games in which the questions to each prover are chosen independently. We 
show that the maximum winning probability of such games can be approximated within additive 
error $\varepsilon$ by a linear program in time $\exp(O(\log|Q| + \log^2|A|/\varepsilon^2))$, with $|Q|$ and $|A|$ the question and
answer alphabet sizes, respectively. The run-time matches the performance of a different algorithm
for the problem due to Aaronson, Impagliazzo, Moshkovitz, and Shor [3]⁶. Although this
is a purely classical result, we establish it by exploring a connection to non-local games: We show
that for any two-player one-round free game, one can find another game on $m$ players such that
the maximum winning probability under non-signaling strategies, which can be computed by a
linear program [51], gives an $O(\sqrt{\ln|A|/m})$-additive approximation to the maximum winning probability
of the original game. Note that since non-signaling strategies are at least as powerful as entangled 
strategies, the same result holds also for games in which the players share entanglement. 

Using the observation above for entangled strategies, together with a hardness result for free 
games from [3], we also show that 3-SAT on $n$ variables can be reduced to obtaining a constant error
approximation of the maximum winning probability under entangled strategies of $O(\sqrt{n})$-player
one-round non-local games, in which the players communicate $O(\sqrt{n})$ bits all together. Finally, we
show how one would be able to establish NP-hardness of approximating the maximum winning
probability under entangled strategies of a 4-player one-round game if one could appropriately
strengthen one of the new quantum de Finetti theorems of this paper. This gives a new approach
to this problem, which is one of the most outstanding open questions concerning non-local games. 

3 QMA is the quantum version of NP. QMA(2), in turn, is a version of QMA in which one is given two proofs, with
the promise they are not entangled with each other; see Section 2.3.

4 By Ref. [46] we know QMA(2) with constant soundness gives at least a quadratic reduction in proof size relative to
QMA, under plausible computational complexity assumptions; see Section 2.3.

5 See [22, 46] for more evidence this might be the case, along with obstacles to proving it.

6 This algorithm was communicated to us already in 2010, although the result has appeared publicly only in [3].

Polynomial Optimization: We consider the connection |4T1 HOl l43l between quantum de Finetti
theorems and the optimization over separable states, on one hand, and polynomial optimization
and the Sum-of-Squares (Parrilo/Lasserre) hierarchy, on the other hand, and prove that
the optimization of certain degree-$d$ polynomials over the $n$-dimensional hypersphere can be
approximated to error $\varepsilon$ in quasipolynomial time in the number of variables by considering
$O(\log(n) d^2 \varepsilon^{-2})$ rounds of the Sum-of-Squares hierarchy of semidefinite programs. This result can
be considered as an extension to the hypersphere of similar results for the simplex [76]. Moreover,
employing the result of Chen and Drucker [29], we show that $\Omega(d^2)$ rounds are necessary to obtain
even a constant error approximation, unless there are subexponential-time algorithms for SAT.

Separability Testing: Another application is to give an algorithm for deciding separability of
multipartite states which is quasi-polynomial in the local dimensions of the subsystems. Given a
multipartite state $\rho^{A_1 \cdots A_l}$ we prove one can decide whether it is fully separable or $\varepsilon$-away from
separable in time $\exp\left(O\left(\left(\sum_k \ln|A_k|\right)^2 l^2 \varepsilon^{-2}\right)\right)$, with distance measured either by the one-way
LOCC norm ll2l) or by a multipartite version of the Frobenius norm introduced in [63]. This generalizes
the findings of [22] from bipartite states to general multipartite states, and vastly improves
on the bound of |2H .

Efficient State Tomography: A final application of the new de Finetti theorems is to quantum 
state tomography. The starting point is a result due to Aaronson |QQ|, based on computational 
learning theory, showing that given an unknown n-qubit state one can perform tomography that 
allows us to compute to good accuracy the statistics of most observables by measuring only $O(n)$
i.i.d. copies of the state. The new de Finetti theorems we prove allow us to relax the assumption
of having i.i.d. copies of the state (which can never be fully certified), showing that essentially
the same conclusion holds true for arbitrary quantum states, as long as one can select a few of
its subsystems at random and perform the original scheme on them (weakening however the
number of subsystems needed from $O(n)$ to $\mathrm{poly}(n)$, of which only $O(n)$ are measured and the
rest discarded). 

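Notation: Let $\mathcal{D}(\mathcal{H})$ be the set of quantum states on $\mathcal{H}$, i.e. positive semidefinite matrices of unit
trace acting on the vector space $\mathcal{H}$. We say $\rho^{AB} \in \mathcal{D}(A \otimes B)$ is a $k$-extendible state if there is
a state $\rho^{AB_1 \cdots B_k} \in \mathcal{D}(A \otimes B^{\otimes k})$ such that $\rho^{AB_j} = \rho^{AB}$ for all $j \in [k]$. Let $\mathrm{Sep}(A : B)$ denote
the set of separable states in $\mathcal{D}(A \otimes B)$, i.e. $\mathrm{Sep}(A : B) = \{\rho^{AB} : \rho = \sum_i p_i\, \sigma_i^A \otimes \sigma_i^B\}$. We say
$\rho^{A_1 \cdots A_k} \in \mathcal{D}(A^{\otimes k})$ is permutation symmetric if $\rho^{A_{\pi(1)} \cdots A_{\pi(k)}} = \rho^{A_1 \cdots A_k}$ for any permutation $\pi \in S_k$
(with $S_k$ the symmetric group of order $k$).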

A quantum measurement (also called a POVM or positive-operator valued measure) is given
by a set of matrices $\{M_k\}$ such that $M_k \geq 0$ and $\sum_k M_k = I$. We associate to any measurement
a map $\Lambda(X) = \sum_k \mathrm{tr}(M_k X)\, |k\rangle\langle k|$, with $\{|k\rangle\}$ an orthonormal basis. We denote the set of maps
associated to measurements by $\mathcal{M}$. These are also called quantum-classical channels, since they
map quantum states to probability distributions.
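
The map $\Lambda$ can be made concrete with a few lines of Python. This is a minimal sketch using nested lists for matrices; the helper names `trace_of_product` and `measure`, and the single-qubit example, are hypothetical and only meant to illustrate that the output of a quantum-classical channel is the vector of outcome probabilities $\mathrm{tr}(M_k X)$.

```python
from typing import List

Matrix = List[List[complex]]


def trace_of_product(m: Matrix, x: Matrix) -> complex:
    """Return tr(M X) for two square matrices of the same size."""
    n = len(m)
    return sum(m[i][j] * x[j][i] for i in range(n) for j in range(n))


def measure(povm: List[Matrix], rho: Matrix) -> List[float]:
    """Diagonal of Lambda(rho) = sum_k tr(M_k rho) |k><k|.

    For a valid POVM and a valid state these are the outcome probabilities.
    """
    return [trace_of_product(m_k, rho).real for m_k in povm]


# Assumed example: computational-basis measurement of the state |+><+|.
povm = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
probs = measure(povm, rho_plus)
print(probs)
```

Because the POVM elements sum to the identity and the state has unit trace, the returned list sums to one.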

Let $p(x_1, \ldots, x_k | a_1, \ldots, a_k)$ on $X^{\times k} \times A^{\times k}$ be a conditional probability distribution. We say it is
non-signaling if $p(x_j|a_j)$ is independent of $a_i$ for $i \neq j$. We say $p(x, y|a, b)$ is $k$-extendible if there
is a non-signaling distribution $p(x, y_1, \ldots, y_k | a, b_1, \ldots, b_k)$ which is permutation-symmetric in the
$B$ systems, i.e. $p(x, \pi^{-1}(y_1), \ldots, \pi^{-1}(y_k) | a, \pi^{-1}(b_1), \ldots, \pi^{-1}(b_k)) = p(x, y_1, \ldots, y_k | a, b_1, \ldots, b_k)$ for
all permutations $\pi \in S_k$, and whose marginal is $p(x, y|a, b)$. We call LHV (local hidden variable)
the set of conditional probability distributions of the form $p(x, y|a, b) = \sum_i \pi_i\, q_i(x|a)\, r_i(y|b)$ for a
probability distribution $\pi$ and local conditional distributions $q_i$, $r_i$.
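
The non-signaling condition can be checked mechanically for small examples. The sketch below tests it for two bipartite boxes with binary inputs and outputs; the Popescu-Rohrlich box and the "copy" box are standard textbook examples that are assumed here for illustration and are not taken from the text, and the function names are hypothetical.

```python
from itertools import product


def pr_box(x: int, y: int, a: int, b: int) -> float:
    """p(x, y | a, b) = 1/2 if x XOR y equals a AND b, and 0 otherwise."""
    return 0.5 if (x ^ y) == (a & b) else 0.0


def copy_box(x: int, y: int, a: int, b: int) -> float:
    """A signaling box: the first output deterministically copies the input b."""
    return 1.0 if (x == b and y == 0) else 0.0


def is_non_signaling(p, outputs=(0, 1), inputs=(0, 1), tol=1e-12) -> bool:
    """Check that each party's marginal does not depend on the other's input."""
    for a, x in product(inputs, outputs):
        marginals = [sum(p(x, y, a, b) for y in outputs) for b in inputs]
        if max(marginals) - min(marginals) > tol:
            return False
    for b, y in product(inputs, outputs):
        marginals = [sum(p(x, y, a, b) for x in outputs) for a in inputs]
        if max(marginals) - min(marginals) > tol:
            return False
    return True


print(is_non_signaling(pr_box), is_non_signaling(copy_box))
```

The first box has uniform marginals for every input pair, while the second lets one party learn the other's input, which the check rejects.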
2.1 Quantum de Finetti Theorems for Local Measurements

By monogamy of entanglement we expect a $k$-extendible state $\rho^{AB}$ to be close to a separable
state, since the $A$ subsystem is equally correlated to $k$ systems. The next theorem gives a quantitative
version of this fact both for entanglement and for non-signaling distributions.

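Theorem 1.

1. Let $\rho^{AB} \in \mathcal{D}(A \otimes B)$ be a $k$-extendible state and $\mu(m)$ a distribution over quantum operations
$\{\Lambda_{A,m}\}_m$, with $\Lambda_{A,m} : \mathcal{D}(A) \to \mathcal{D}(X)$. Then

$$\min_{\sigma \in \mathrm{Sep}(A:B)} \max_{\Lambda_B \in \mathcal{M}} \mathbb{E}_{m \sim \mu} \left\| \Lambda_{A,m} \otimes \Lambda_B \left(\rho^{AB} - \sigma^{AB}\right) \right\|_1 \leq \sqrt{\frac{2 \ln|X|}{k}}. \qquad (1)$$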

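2. Let $\rho^{AB} \in \mathcal{D}(A \otimes B)$ be a $k$-extendible state, $\mu(m)$ a distribution over operators $\{\Lambda_{A,m}\}_m$ mapping
$\mathcal{D}(A)$ to $\mathcal{D}(X)$, and $\Lambda_B$ a measurement on $\mathcal{D}(B)$. Then in time $\mathrm{poly}(|A|, |B|^k)$ a classical computer
can compute $\sigma \in \mathrm{Sep}(A : B)$ such that

$$\mathbb{E}_{m \sim \mu} \left\| \Lambda_{A,m} \otimes \Lambda_B \left(\rho^{AB} - \sigma^{AB}\right) \right\|_1 \leq \sqrt{\frac{2 \ln|X|}{k}}. \qquad (2)$$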

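3. Let $p(x, y|a, b)$ on $X \times Y \times A \times B$ be a $k$-extendible non-signaling conditional probability distribution
and let $\mu$ be a distribution over $A$. Then

$$\min_{q \in \mathrm{LHV}} \max_{b \in B} \mathbb{E}_{a \sim \mu} \left\| p(x, y|a, b) - q(x, y|a, b) \right\|_1 \leq \sqrt{\frac{2 \ln|X|}{k}}. \qquad (3)$$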

The most important aspect of the theorem is that the error term is independent of the subsystem
dimensions of $\rho^{AB}$, and only depends on the output dimension of the family of quantum
operations $\{\Lambda_{A,m}\}_m$. Likewise, for non-signaling distributions the bound is independent of the
number of measurement settings of $p(x, y|a, b)$.
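
To get a feel for this scaling, the following Python sketch evaluates the error term and the number of extensions needed for a target accuracy. It assumes that the right-hand side of Eq. (1) reads $\sqrt{2 \ln|X| / k}$; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
from math import ceil, log, sqrt


def local_measurement_error(k: int, out_dim: int) -> float:
    """Assumed error term sqrt(2 ln|X| / k) for a k-extendible state."""
    return sqrt(2 * log(out_dim) / k)


def extensions_needed(eps: float, out_dim: int) -> int:
    """Smallest k for which the assumed error term is at most eps."""
    return ceil(2 * log(out_dim) / eps**2)


# Assumed example: binary-outcome operations on A and target error 0.1.
k = extensions_needed(0.1, 2)
print(k, local_measurement_error(k, 2))
```

Under this assumption the required $k$ grows only logarithmically with the output dimension $|X|$ and does not involve $|A|$ or $|B|$ at all.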

The de Finetti bound from Ref. [22] can be recovered (with an improved constant) as a special
case of part 1 of Theorem 1 by choosing the singleton distribution composed of the identity channel
on $A$, since

$$\|\rho^{AB} - \sigma^{AB}\|_{\mathrm{LOCC}^{\leftarrow}} = \max_{\Lambda \in \mathcal{M}} \|(\mathrm{id} \otimes \Lambda)(\rho^{AB} - \sigma^{AB})\|_1. \qquad (4)$$

We remark that this result also follows from the work of Yang [87] using the fact that the entanglement
of formation IfTBl is upper-bounded by the log of either of the local dimensions, together
with a variant of the Pinsker inequality adapted to $\mathrm{LOCC}^{\leftarrow}$ [75]. It also follows from the recent
work of Li and Winter [66].

The proof of Theorem 1 (found in Section 3) is more direct and general than the proofs in [22,
87, 66], in particular not making use of entanglement measures in any explicit way. This enables us
to obtain parts 2 and 3 of the theorem (but see the discussion of Conjecture 5 for an example of how
the generality of Theorem 1 limits our abilities to further improve it). We remark that the explicit
rounding in part 2 was mostly known only for the variants of the de Finetti theorem requiring
$k > d$ [80, 32, 70, 43], and the previous de Finetti theorems for non-signaling boxes [33, 12] were
similarly inefficient. The main exception to this is [11], which achieves a similar but incomparable
bound for measurements with nonnegative matrix elements, together with an efficient rounding
scheme. Ref. [11] was also an important source of inspiration for the current work.

7 The one-way LOCC norm is defined as $\|X\|_{\mathrm{LOCC}^{\leftarrow}} = \max_{0 \leq M \leq I} \mathrm{tr}(XM)$, with the maximization over all
POVMs $\{M, I - M\}$ that can be realized by local operations and one-way classical communication from B to A.
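
The information-theoretic idea behind bounds of this form can be sketched in a few lines of LaTeX. This is only a heuristic outline for classical registers, written under the assumption that $X$ is the outcome of the operation on $A$ and $Y_1, \ldots, Y_k$ are outcomes of measurements on the $k$ copies of $B$; it is not a substitute for the actual proof in Section 3.

```latex
% Heuristic sketch; entropies and mutual informations are measured in nats.
\begin{align*}
  \ln|X| \;\ge\; I(X : Y_1 \cdots Y_k)
     &= \sum_{j=1}^{k} I(X : Y_j \mid Y_1 \cdots Y_{j-1})
     && \text{(chain rule)} \\
  \Longrightarrow\quad \exists\, j :\;
     I(X : Y_j \mid Y_1 \cdots Y_{j-1}) &\le \frac{\ln|X|}{k}, \\
  \mathbb{E}_{z}\, \bigl\| p_{XY|z} - p_{X|z} \otimes p_{Y|z} \bigr\|_1
     &\le \sqrt{2\, I(X : Y \mid Z)}
     && \text{(Pinsker and concavity of } \sqrt{\cdot}\,\text{)} .
\end{align*}
```

Conditioning on the earlier outcomes therefore leaves $X$ and $Y_j$ nearly in product form on average, with a deviation of order $\sqrt{2 \ln|X| / k}$.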

The next theorem gives a generalization of the result of [22] to an arbitrary number of subsystems,
as well as to non-signaling distributions.

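Theorem 2.

1. Let $\rho^{A_1 \cdots A_k} \in \mathcal{D}(A^{\otimes k})$ be a permutation-invariant state. Then for every $0 < l < k$ there is a measure
$\nu$ on $\mathcal{D}(A)$ such that

$$\max_{\Lambda_2, \ldots, \Lambda_l \in \mathcal{M}} \left\| (\mathrm{id} \otimes \Lambda_2 \otimes \cdots \otimes \Lambda_l) \left( \rho^{A_1 \cdots A_l} - \int \nu(d\sigma)\, \sigma^{\otimes l} \right) \right\|_1 \leq \sqrt{\frac{2 l^2 \ln|A|}{k - l}}. \qquad (5)$$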

2. Let $p(X_1 \cdots X_k | A_1 \cdots A_k)$ be a permutation-invariant non-signaling conditional probability
distribution (i.e. $p$ is invariant under simultaneous permutation of the $X$ and $A$ systems). Fix a product
distribution $\mu = \mu_1 \otimes \cdots \otimes \mu_k$ on $A_1 \times \cdots \times A_k$. Then for every $0 < l < k$ there is a measure $\nu$ on
single-system conditional probability distributions such that

$$\mathbb{E}_{a_1, \ldots, a_l \sim \mu} \left\| p(X_1 \cdots X_l | a_1, \ldots, a_l) - \mathbb{E}_{q \sim \nu}\, q(X_1|a_1) \otimes \cdots \otimes q(X_l|a_l) \right\|_1 \leq \sqrt{\frac{2 l^2 \ln|X|}{k - l}}. \qquad (6)$$

In Ref. [39] Diaconis and Freedman proved that for a permutation-symmetric probability
distribution $p_{X_1 \cdots X_k}$ on $k$ subsystems, $p_{X_1 \cdots X_l}$ is $\frac{l(l-1)}{k}$-close (in variational distance) to a convex
combination of i.i.d. probability distributions. Theorem 2 can be seen as an analogue of this result for
quantum states and non-signaling probability distributions. However, instead of having a bound which is
independent of the dimension, we only have a bound that depends logarithmically on the dimension
(and the notion of approximation is weaker than variational distance). It is an interesting
question whether this can be improved. Note however that we give in Section 2.3 a computational
complexity argument that the $l^2/k$ dependency is optimal.
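
The gain over the trace-norm bound quoted in the introduction can be seen numerically. The sketch below assumes that the right-hand side of Eq. (5) reads $\sqrt{2 l^2 \ln|A| / (k - l)}$ and uses the trace-norm error $2 d^2 l / k$ from the introduction; the function names and parameter values are hypothetical.

```python
from math import log, sqrt


def trace_norm_bound(d: int, l: int, k: int) -> float:
    """Trace-norm de Finetti error 2 d^2 l / k quoted in the introduction."""
    return 2 * d**2 * l / k


def local_measurement_bound(d: int, l: int, k: int) -> float:
    """Assumed form of Eq. (5): sqrt(2 l^2 ln(d) / (k - l)), valid for l < k."""
    return sqrt(2 * l**2 * log(d) / (k - l))


# Assumed example: ten qubits per subsystem, two subsystems kept out of 10^4.
d, l, k = 2**10, 2, 10**4
print(trace_norm_bound(d, l, k), local_measurement_bound(d, l, k))
```

For these values the trace-norm bound exceeds one and is therefore vacuous, while the assumed local-measurement bound is already below 0.1.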



2.2 Non-Local Games: Algorithms and Hardness Results 

One application of Theorem 1 is to the computational complexity of non-local games. A
multiprover game is played between a set of cooperative players/provers, who are not allowed to
communicate with each other, and a referee/verifier who interrogates the provers to decide if
they win the game. In a one-round game, for example, the verifier chooses questions to each 
prover at random and checks the answers obtained from the provers in order to decide whether 
to accept or not. Even though the provers cannot communicate with each other, they can agree on 
a common strategy in order to win the game with the maximum probability possible. 

Multiprover games have had a central role in computational complexity theory. In a seminal 
paper Babai, Fortnow, and Lund proved NEXP = MIP [9], with MIP the class of languages having
multi-prover interactive proof with a polynomial number of provers, rounds, and bits exchanged 
between the provers and the verifier in each round. Building on Q, it was then proven in JSKZl 
that it is NP-hard to approximate to constant error the maximum winning probability of a two-
player one-round game (with the input size given by the total number of questions to the players 
and their answers). This hardness result is equivalent to the celebrated PCP theorem 001, which
has a pivotal role in hardness of approximation results (see e.g. |U). 

It is natural to allow the players to share correlations that might assist them in winning the 
game with a higher probability. While it is easy to see that shared randomness is of no help, 
it has been known since the seminal work of Bell [15] that entanglement might help the players
to win with a probability strictly larger than with a purely classical strategy. One can even
consider stronger correlations than the ones allowed by quantum mechanics, such as arbitrary 
non-signaling correlations. Games in which the players can use entanglement (or more general 
non-signaling correlations) are known as non-local games, since the extra shared resources allow 
the players to sometimes use strategies that cannot be reproduced by local ones (i.e. strategies 
only using shared randomness and local actions). Upper bounds on the maximum winning prob- 
ability of a one-round game under classical strategies are known as Bell inequalities, and non-local 
strategies that beat these bounds are known as Bell inequality violations. Such violations of Bell 
inequalities are central in the foundations of quantum mechanics as they can be implemented 
experimentally to show that nature cannot be described by a local hidden variable theory [8].

Given the usefulness of multiprover games to computational complexity theory and of non- 
local games to the foundations of quantum mechanics, it is interesting to study how difficult it is 
to compute the entangled value of the game, defined as the maximum probability of winning the 
game using entanglement, or the non-signaling value of the game, defined as the optimal probabil- 
ity under non-signaling strategies. By contrast, the maximum winning probability under classical 
strategies is called the classical value of the game. 

Although a priori computing the entangled value of a game requires optimizing over a large 
set, in some cases this can be easier. Indeed, for unique games, the best known algorithms for the 
classical value [13] run in time $\exp(n^{\varepsilon})$ (with $0 < \varepsilon < 1$ depending on the desired degree of
approximation), whereas the entangled value of the game can be estimated in polynomial time using
semidefinite programming [57] (or exactly calculated for the special case of XOR games [35]).
These two classes of games could be taken as evidence that the estimation of the entangled value
is generally easier than of the classical value. However, if one is interested in a high-accuracy
estimation this turns out not to be true. Kempe, Kobayashi, Matsumoto, Toner, and Vidick proved
that it is NP-hard to approximate to an inverse polynomial (in the size of the game) the entangled
value of one-round 3-prover games [56] (see also [53, 52, 34]). Recently, in a beautiful development,
Ito and Vidick [54] proved that it is NP-hard, under quasi-polynomial reductions, to approximate
the entangled value of 3-prover games with polynomially many rounds even to constant error.
The result [54] has a more elegant formulation in terms of interactive proof systems: It shows that
$\mathrm{NEXP} \subseteq \mathrm{MIP}^*$, with MIP* the analogue of MIP in which the provers share entanglement [60]. The
maximum probability of non-signaling strategies, in turn, can always be computed efficiently by
linear programming [51].

Probably the biggest open question in this area is to determine the computational complexity 
of approximating the entangled value of one-round games to constant accuracy. There are two 
reasons why this is a particularly interesting setting. The first is the fact that the PCP theorem can
be stated as the NP-hardness of approximating the classical value of one-round games to constant
accuracy. Thus an analogous result for the entangled value could be interpreted as a version of the
PCP theorem in the presence of entanglement. Second, in Bell inequality violation experiments,
which are one-round non-local games, one can only obtain a constant-accuracy approximation to
the true violation due to experimental error. Therefore it is important to understand how efficiently
one can estimate to constant error the maximum violation of a Bell inequality, since this
is the most experimentally relevant approximation scale. One of our goals here is to propose a new
approach to address this problem.

A particular class of games that we will consider are the so-called free games, defined as games
in which the questions to each of the players are chosen independently from the questions to the
other players [26]. These games are usually considered in physics in the context of violations
of Bell inequalities, the CHSH game being an example. The fact that the verifier cannot coordinate
questions suggests that the computation of the maximum winning probability of such games
should not be as hard as for general games. And indeed Bellare, Feige, and Kilian proved that the
analogue of MIP for poly-round free games is equal to PSPACE [16], while Aaronson, Impagliazzo,
Moshkovitz, and Shor [3] proved that the classical value of one-round free games with questions
to the two provers in $Q \times Q$ and answers in $A_1 \times A_2$ can be simulated to within error $\varepsilon$ by AM
(Arthur-Merlin) proofs with an $O(\log|Q| + \log(|A_1| \cdot |A_2|)/\varepsilon)$-bit message from Arthur to Merlin
and an $O(\log|A_1| \log|A_2|)$-bit message from Merlin to Arthur. As a result, the value of such
games can be estimated in time $\mathrm{poly}(\log|Q|) \exp(\log|A_1| \log|A_2|)$. They also gave a matching
hardness of approximation result for free games, showing that one can reduce 3-SAT on $n$ binary
variables to computing $\omega_c(G)$ to within constant additive error for 2-player one-round free games
with $\exp(O(\sqrt{n}))$-sized answer alphabet.
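
Whether a two-player game is free can be decided directly from its question distribution. The following Python sketch checks that a distribution over question pairs equals the product of its marginals; the dictionary representation, the function name `is_free`, and the two example distributions are assumptions made for illustration.

```python
from itertools import product


def is_free(pi: dict, tol: float = 1e-12) -> bool:
    """pi maps question pairs (q1, q2) to probabilities.

    Returns True iff pi(q1, q2) = pi_1(q1) * pi_2(q2) for all pairs,
    i.e. the questions to the two players are chosen independently.
    """
    q1s = sorted({q1 for q1, _ in pi})
    q2s = sorted({q2 for _, q2 in pi})
    m1 = {q1: sum(pi.get((q1, q2), 0.0) for q2 in q2s) for q1 in q1s}
    m2 = {q2: sum(pi.get((q1, q2), 0.0) for q1 in q1s) for q2 in q2s}
    return all(
        abs(pi.get((q1, q2), 0.0) - m1[q1] * m2[q2]) <= tol
        for q1, q2 in product(q1s, q2s)
    )


# CHSH-style uniform questions are independent; perfectly correlated ones are not.
uniform = {(q1, q2): 0.25 for q1, q2 in product((0, 1), repeat=2)}
correlated = {(0, 0): 0.5, (1, 1): 0.5}
print(is_free(uniform), is_free(correlated))
```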

As a corollary of Theorem 1 we will prove that the classical value of free games can be computed
efficiently by linear programming, matching the run-time of the algorithm of [3]. Moreover,
we will also derive a non-trivial hardness of approximation result for the entangled value of
non-local games by importing to the case of entangled strategies the hardness of approximation result
for the classical value of free games from [3]. Finally we will show how a conjectured strengthening
of Theorem 1 would imply the NP-hardness of obtaining a constant error approximation of
$\omega_e$ for four-player one-round games.

Before we turn to the precise statement of the main result of this section let us give a more 
formal definition of non-local games. 

Definition 3. We define a m-prover game G(m, tt, V) by two parameters it and V: 

1. 7r is a probability distribution on Qi x . . . x Q m for finite sets Qi, . . . , Q m . 

2. V is a predicate on Qi x . . . x Q m x A± x . . . x A m for finite sets A\, . . . , A m . 

The sets Qi and Ai consist of the possible questions and answers, respectively, for player i. The predicate 
< V(a\, . . . , a m \qi, • • • , q m ) < 1 is the pay-off function of the answer (01, ... , a m ) given the question 
(qi, . . .,q m ). 

The classical value of the game $G$ is given by

\[ \omega_c(G(m,\pi,V)) := \max_{a_1,\ldots,a_m} \sum_{q_1,\ldots,q_m} \pi(q_1,\ldots,q_m)\, V(a_1(q_1),\ldots,a_m(q_m) \,|\, q_1,\ldots,q_m), \tag{7} \]

where the maximum is over all functions $a_j : Q_j \to A_j$.
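
As a concrete illustration of Eq. (7), the following minimal sketch brute-forces the classical value of a small game by enumerating all deterministic strategies. The function name `classical_value`, the dictionary encoding of $\pi$ and the choice of CHSH as the example are assumptions made here for illustration; they do not appear in the paper.

```python
from fractions import Fraction
from itertools import product


def classical_value(pi, V, questions, answers):
    """Brute-force the classical value of an m-player game, as in Eq. (7).

    pi:        dict mapping question tuples (q_1, ..., q_m) to probabilities
    V:         payoff V(a, q) in [0, 1] for answer tuple a and question tuple q
    questions: list of m question sets Q_i
    answers:   list of m answer sets A_i
    """
    m = len(questions)
    # A deterministic strategy for player i is a function Q_i -> A_i,
    # stored as a tuple of answers indexed by position in Q_i.
    per_player = [list(product(answers[i], repeat=len(questions[i]))) for i in range(m)]
    index = [{q: j for j, q in enumerate(questions[i])} for i in range(m)]
    best = Fraction(0)
    for strategy in product(*per_player):
        total = Fraction(0)
        for q, prob in pi.items():
            a = tuple(strategy[i][index[i][q[i]]] for i in range(m))
            total += prob * V(a, q)
        best = max(best, total)
    return best


# CHSH as a free game: uniform (product) distribution on {0,1} x {0,1};
# the players win when a XOR b equals q1 AND q2.
chsh_pi = {(r, q): Fraction(1, 4) for r in (0, 1) for q in (0, 1)}


def chsh_V(a, q):
    return 1 if (a[0] ^ a[1]) == (q[0] & q[1]) else 0


chsh_value = classical_value(chsh_pi, chsh_V, [(0, 1), (0, 1)], [(0, 1), (0, 1)])
print(chsh_value)  # 3/4
```

The enumeration is exponential in the number of questions, so it is only meant to make the definition tangible on toy instances.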

8 There are suggestive similarities between this result and results about QMA(2) and variants thereof; see Section l23l 
and 061. 



The entangled value of the game, in turn, is given by

\[ \omega_e(G(m,\pi,V)) := \sup \sum_{q_1,\ldots,q_m} \pi(q_1,\ldots,q_m) \sum_{a_1,\ldots,a_m} V(a_1,\ldots,a_m | q_1,\ldots,q_m)\, \langle\psi| M^1_{a_1|q_1} \otimes \cdots \otimes M^m_{a_m|q_m} |\psi\rangle, \tag{8} \]

where the supremum is over states $|\psi\rangle$ of arbitrary dimension and arbitrary POVMs

\[ \{M^1_{a_1|q_1}\}_{a_1 \in A_1}, \ldots, \{M^m_{a_m|q_m}\}_{a_m \in A_m}, \tag{9} \]

with $\sum_{a_k \in A_k} M^k_{a_k|q_k} = I$ for every $q_k \in Q_k$ and $k \in [m]$.

Finally the non-signaling value of the game $G$ is defined as

\[ \omega_{ns}(G(m,\pi,V)) := \max_{p} \sum_{q_1,\ldots,q_m} \pi(q_1,\ldots,q_m) \sum_{a_1,\ldots,a_m} V(a_1,\ldots,a_m | q_1,\ldots,q_m)\, p(a_1,\ldots,a_m | q_1,\ldots,q_m), \tag{10} \]

where the maximum is over all non-signaling probability distributions $p(a_1, \ldots, a_m | q_1, \ldots, q_m)$.
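
To make the non-signaling condition concrete, the sketch below checks it for the Popescu-Rohrlich box and evaluates its winning probability in the CHSH game. The PR box and the helper names are illustrative choices made here, not objects used in the paper; the point is only that a non-signaling distribution can exceed the classical CHSH value of 3/4.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)


def pr_box(a, b, q1, q2):
    """Popescu-Rohrlich box: outputs are uniform subject to a XOR b == q1 AND q2."""
    return Fraction(1, 2) if (a ^ b) == (q1 & q2) else Fraction(0)


def is_non_signaling(p):
    """Check that each player's marginal does not depend on the other's question."""
    for a, q1 in product(BITS, BITS):
        marginals = {sum(p(a, b, q1, q2) for b in BITS) for q2 in BITS}
        if len(marginals) != 1:
            return False
    for b, q2 in product(BITS, BITS):
        marginals = {sum(p(a, b, q1, q2) for a in BITS) for q1 in BITS}
        if len(marginals) != 1:
            return False
    return True


def chsh_value(p):
    """Winning probability in CHSH (uniform questions) for the distribution p."""
    total = Fraction(0)
    for q1, q2, a, b in product(BITS, repeat=4):
        win = 1 if (a ^ b) == (q1 & q2) else 0
        total += Fraction(1, 4) * win * p(a, b, q1, q2)
    return total


pr_is_ns = is_non_signaling(pr_box)
pr_value = chsh_value(pr_box)
print(pr_is_ns, pr_value)  # True 1
```
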
Corollary 4.

1. Let $G(2, \pi, V)$ be a two-player one-round non-local free game with $\pi$ a product probability
distribution on $R \times Q$ and $V$ a predicate on $R \times Q \times A \times B$. Then there is an
$(m+1)$-player one-round non-local game $G'(m+1, \pi', V')$ with $\pi'$ a probability distribution on
$R \times Q_1 \times \cdots \times Q_m$, with $|Q_k| = |Q|$ for $k \in [m]$, and $V'$ a predicate on
$R \times Q_1 \times \cdots \times Q_m \times A \times B_1 \times \cdots \times B_m$, with $|B_k| = |B|$
for $k \in [m]$, such that

\[ \omega_c(G) = \omega_c(G') \le \omega_e(G') \le \omega_{ns}(G') \le \omega_c(G) + \sqrt{\frac{\ln|A|}{2m}}. \tag{11} \]

2. For a free game $G(2, \pi, V)$ there is a linear-programming relaxation of size
$|R||A|(|Q||B|)^{\ln|A|/(2\epsilon^2)}$ for computing $\omega_c(G)$ to within additive error $\epsilon$.

3. One can reduce 3-SAT on $n$ variables to computing $\omega_e(G)$ to within constant additive error for
$O(\sqrt{n})$-player one-round non-local games with answer alphabet size of $\exp(O(\sqrt{n}))$ in which only
two players are asked questions.

See Section 5 for the proof.

We note that it is trivial to prove either a version of part 3 of Corollary 4 in which the answer
alphabet size is $2^n$ (in which case even one prover is clearly enough), or one in which the answer
alphabet size is constant but one has $n$ provers, or one with $\sqrt{n}$ provers and alphabet size
$2^{\sqrt{n}}$ in which all provers respond. However, in our result, the total number of bits sent is
$O(\sqrt{n})$.

Part 2 of Corollary 4 follows directly from part 1 and the fact that $\omega_{ns}$ can be computed by
linear programming. This gives a new algorithm matching the performance of the algorithm due
to Aaronson, Impagliazzo, Moshkovitz, and Shor [3]. Part 3 of Corollary 4 follows from part 1 and
the hardness of approximation result of Ref. [3] for free games.

Part 1 in turn gives a generic relation between the classical value of a free game, on one hand,
and the quantum and non-signaling values of a modified game with more players, on the other
hand. The idea of adding more players is to try to immunize the original game from entanglement
(or general non-signaling correlations) by adding extra consistency tests that force the
entanglement between the players to have a specific form. Indeed the new game with $m+1$ players
consists of playing the original game with player one and one of the remaining $m$ players chosen
at random. This essentially allows us to consider a two-player game where the provers can only
share an $m$-extendible state (or $m$-extendible non-signaling conditional distribution). Then by
Theorem 1 we obtain that this $m$-extendible state cannot be much better than a separable state or
a local hidden variable distribution (which themselves are no better than just having shared
randomness). The crucial aspect of Theorem 1 used here is that the error term only depends on the
number of outcomes (which is given by the number of possible answers of the non-local game in
question), and not on the dimension of the entangled state or on the number of different POVMs in
the family in the quantum case (or the number of measurement settings in the non-signaling case).
The idea of immunizing against entanglement by introducing more players is not new and was used
before by Kempe et al. [56] to prove the hardness of estimating the entangled value within error
inverse polynomial in the size of the game.
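
The construction described above can be sketched on a toy example. The code below builds the $(m+1)$-player game from CHSH under the assumption that the referee picks one of the $m$ extra players uniformly at random, sends questions only to player one and to that player, and applies the original predicate; this reading of the construction, and all names in the code, are illustrative assumptions. Brute force then suggests that adding players leaves the classical value unchanged, as one would expect since the extra players can all use the same strategy.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)


def chsh_win(a, b, r, q):
    return 1 if (a ^ b) == (r & q) else 0


def classical_value_extended(m):
    """Classical value of the (m+1)-player game built from CHSH.

    The referee draws (r, q) uniformly, picks j uniformly from the m extra
    players, sends r to player one and q to player j, and checks the CHSH
    predicate on their two answers.
    """
    functions = list(product(BITS, repeat=2))  # all maps {0,1} -> {0,1}
    best = Fraction(0)
    for f_one in functions:
        for f_others in product(functions, repeat=m):
            total = Fraction(0)
            for j in range(m):
                for r, q in product(BITS, BITS):
                    total += Fraction(1, 4 * m) * chsh_win(f_one[r], f_others[j][q], r, q)
            best = max(best, total)
    return best


values = [classical_value_extended(m) for m in (1, 2, 3)]
print(values)  # [Fraction(3, 4), Fraction(3, 4), Fraction(3, 4)]
```
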

More generally it was observed by Terhal, Doherty and Schwab [83] that $m$-extendible states
cannot violate any Bell inequality with fewer than $m$ measurements for Bob (and an arbitrary
number of measurements for Alice). In contrast Theorem 1 shows that a non-signaling $m$-extendible
conditional distribution can violate a Bell inequality associated to a free game (an example of
which is the CHSH inequality) with an arbitrary number of measurements, each with $M$ possible
outcomes, by at most $\frac{1}{2}\sqrt{\frac{2\ln M}{m}}$. This is an instance of the concept of
monogamy of entanglement (which is known to hold true for non-signaling distributions as well [33]),
applied in this case to the non-locality of quantum states (i.e. the maximum possible violation of
a Bell inequality). Note that to be $\epsilon$-close to a separable state in trace norm (thus having
similar statistics under general quantum measurements) one must consider $m$-extendible states with
$m = \Omega(|B|/\epsilon)$, with $|B|$ the dimension of the $B$ subsystem [32]. The monogamy of
non-locality we find here, in comparison, has a bound that is independent of the dimension of the
state.

Finally let us mention a conjecture whose validity would imply the NP-hardness of estimating
$\omega_e$ to within constant error for 4-player one-round games. The conjecture is the following
strengthening of Theorem 1.

Conjecture 5. Let $\rho^{AB} \in \mathcal{D}(A \otimes B)$ be a $k$-extendible state and $\mu(m)$ a distribution over quantum
operations $\{\Lambda_{A,m}\}_m$, with $\Lambda_{A,m} : \mathcal{D}(A) \to \mathcal{D}(X)$. Then

(12)

where $\mathbb{E}$ denotes the expectation over $\mu$.

The difference with Theorem 1 is that the order of the expectation over $\mu$ and the maximization
over measurements is reversed. It is easy to check that, given the conjecture, one would be able to
carry through the proof of part 1 of Corollary 4 given in Section 5 for general games (of course
only for the relation of $\omega_e$ and $\omega_c$). The fact that we would be able to prove
NP-hardness for 4-player games would then follow from the combination of this stronger version of
Eq. (11) with a recent version of the PCP theorem due to Khot and Safra, in the language of
two-prover one-round games [59].

Although the conjecture is consistent with all the examples of states we are aware of, we note
that a proof would have to follow a very different approach to the one used in Theorem 1, as it
cannot apply to non-signaling distributions. In this respect the hypothesis testing approach of
Refs. [22, 66] might be a promising route.

2.3 Optimality of Chen and Drucker's Multiple-Proof Protocol for 3-SAT 

One first application of Theorem 2 is to unentangled multiple proof systems.

Given a 3-SAT formula with $n$ variables and $O(n)$ clauses, what is the minimum proof size that
can convince a verifier the formula is satisfiable? Under the exponential time hypothesis [48],
which says that 3-SAT cannot be solved in subexponential time, $\Omega(n)$ bits are required, i.e.
it is believed one cannot do anything substantially better than just write down the $n$-bit
satisfying assignment. What if we can send a quantum state as a proof to a verifier who has a
quantum computer to check its validity? Perhaps we could pack more information into the quantum
state so that $o(n)$ qubits would be enough to convince the verifier? It turns out that assuming a
quantum version of the exponential time hypothesis, namely that solving 3-SAT takes exponential
time even on a quantum computer (see e.g. [17] for the oracle version of this claim), $\Omega(n)$
qubits are required [68].

Quantum mechanics allows us to add a new twist to this question. What if we want to convince
a quantum verifier by sending a quantum state to her, but with the promise that parts of the
quantum state are not entangled with each other? In this case the argument of Ref. [68] does not
apply anymore, and at least we do not have any implausible consequence for having a sublinear
proof. And indeed Aaronson, Beigi, Drucker, Fefferman, and Shor [2] (building on HD) proved
that $\sqrt{n}\,\mathrm{polylog}(n)$ unentangled quantum states, each of $\log(n)$ qubits, are enough to convince a
quantum verifier that a 3-SAT instance with $n$ variables and $O(n)$ clauses is satisfiable.

The result of [2] was strengthened in two directions: First, Harrow and Montanaro ESI
proved that two unentangled proofs, each of $\sqrt{n}\,\mathrm{polylog}(n)$ qubits, are sufficient. Second, Chen
and Drucker [29] showed that $\sqrt{n}\,\mathrm{polylog}(n)$ identical unentangled quantum proofs of $O(\log(n))$
qubits each are sufficient to convince even a verifier who measures each of the proofs separately
and post-processes the outcomes in order to decide whether to accept or not.

To state the main result of this section we define a few quantum complexity classes (see Section
6 for formal definitions). The first is a natural quantum analogue of NP (more precisely of MA).
Let $\mathsf{QMA}_n(c, s)$ be the class of problems such that: (i) for "yes" instances there is a quantum proof
composed of $n$ qubits that makes the verifier, who has access to polynomial-time quantum
computation, accept with probability at least $c$; and (ii) for "no" instances every proof is accepted with
probability at most $s$. Let $\mathsf{QMA}_n(m, c, s)$ be the analogue of QMA in which instead of one quantum
proof the verifier receives $m$ quantum proofs, each of $n$ qubits, with the promise that they are not
entangled with each other [61].

Further let $\mathsf{BellQMA}_n(m, c, s)$ be an analogue of $\mathsf{QMA}_n(m, c, s)$ in which the verification
procedure is restricted to applying independent measurements to each of the $m$ proofs and then
post-processing the outcomes classically [2]. The name of the class comes from the fact that
the verifier is basically constrained to apply a Bell test as his verification procedure. Finally let
$\mathsf{BellSymQMA}_n(m, c, s)$ be the analogue of $\mathsf{BellQMA}_n(m, c, s)$ in which all the $m$ proofs are promised
to be identical.

With this notation the Chen-Drucker result can be stated as showing the containment of 3-SAT
with $n$ variables and $O(n)$ clauses in
$\mathsf{BellSymQMA}_{\log(n)}(\sqrt{n}\,\mathrm{polylog}(n),\, 1 - 2^{-\Omega(\sqrt{n})},\, 1/\mathrm{poly}(n))$ [29].
A corollary of Theorem 2 is that this is essentially optimal, i.e. the square-root improvement found
for the total proof size is all there is if we restrict ourselves to BellSymQMA protocols.
Corollary 6.

1. $\mathsf{BellSymQMA}_n(m, c, s) \subseteq \mathsf{QMA}_{10 n^2 m^2/\epsilon^2}(c, s + \epsilon)$.

2. For every $\epsilon > 0$ and $c - s = \Omega(1)$, there is no $\mathsf{BellSymQMA}_{O(\log(n))}(n^{1/2-\epsilon}, c, s)$ protocol for 3-SAT
with $n$ variables and $O(n)$ clauses, unless 3-SAT can be solved in $\exp(n^{1-2\epsilon}\,\mathrm{polylog}(n))$ time.

3. $\mathsf{BellQMA}_n(m, c, s) \subseteq \mathsf{QMA}_{10 n^2 m^3/\epsilon^2}(c, s + \epsilon)$.

4. $\mathsf{QMA}_{\mathrm{poly}(n)}(\tfrac{2}{3}, \tfrac{1}{3}) = \mathsf{BellQMA}_{\mathrm{poly}(n)}(\mathrm{poly}(n), \tfrac{2}{3}, \tfrac{1}{3})$.

See Section 6 for the proof.

In [20, 22] it was shown that BellQMA(m) is contained in QMA for a constant number of provers
$m$. Corollary 6 strengthens the containment even to a polynomial number of provers. This gives
a new characterization of the class QMA and shows that the only advantage (in the regime where
$c - s \ge 1/\mathrm{poly}(n)$) that BellQMA protocols can offer is a polynomial reduction in the proof size,
such as in the protocol of [29].



2.4 Polynomial Optimization and Sum-of-Squares Proofs

Another application of our main theorems is to classical algorithms for maximizing polynomials
over $\mathbb{C}^n$. The concepts of $k$-extendable and separable states turn out to correspond naturally to
SDP hierarchies for polynomial optimization, and thus we are able to prove convergence of these
hierarchies for polynomials that correspond to LOCC measurements. This connection was first
established by Doherty, Parrilo and Spedalieri [41], and was more recently made quantitative for
general polynomials over the unit sphere in $\mathbb{R}^n$ by Doherty and Wehner [42, 43].

In this section, we consider the problem of maximizing real-valued polynomial functions
over the complex unit sphere $S^{2n-1} \subset \mathbb{C}^n$. More precisely, we consider polynomials of
$z_1, \ldots, z_n, \bar z_1, \ldots, \bar z_n$ that are bihomogeneous of degree $d, d$ (i.e. homogeneous of degree $d$ in the
$z_1, \ldots, z_n$ and homogeneous of degree $d$ in the $\bar z_1, \ldots, \bar z_n$). This problem is closely related [37] to
optimization over the real unit sphere, though not always identical [36]. When $d > 1$, this is
generally NP-hard; see [38]. A promising general-purpose approximation scheme is to use an SDP
hierarchy invented independently by Parrilo [73] and Lasserre [64]; see also [72] for a recent
review of the complexity-theoretic properties of this hierarchy. To define the hierarchy we introduce
some notation. Let $\mathbb{C}[z, \bar z] := \mathbb{C}[z_1, \ldots, z_n, \bar z_1, \ldots, \bar z_n]$ denote complex polynomials in $n$ variables,
let $\mathbb{C}[z, \bar z]_{d,d}$ denote the set of bihomogeneous polynomials of degree $d, d$, and let $\mathbb{C}[z, \bar z]^*_{d,d}$ denote the
set of Hermitian linear functionals from $\mathbb{C}[z, \bar z]_{d,d}$ to $\mathbb{R}$. Here we will consider only Hermitian linear
functionals $L$, meaning that

\[ L\Big[\prod_{i=1}^n z_i^{\alpha_i} \bar z_i^{\beta_i}\Big] = \overline{L\Big[\prod_{i=1}^n z_i^{\beta_i} \bar z_i^{\alpha_i}\Big]} \quad \text{for any } \alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n. \]

If $p(z) \in \mathbb{C}[z, \bar z]_{d,d}$ and $k \ge d$, then we can upper bound $\max_{z \in S^{2n-1}} p(z)$ with the following
SDP:

\begin{align}
\max\ & L(p) \quad \text{such that} \tag{13a} \\
& L \in \mathbb{C}[z, \bar z]^*_{k,k} \tag{13b} \\
& L(1) = 1 \tag{13c} \\
& L(q \bar q) \ge 0 \quad \forall q \in \mathbb{C}[z, \bar z]_{k,0} \tag{13d} \\
& L((z_1 \bar z_1 + \cdots + z_n \bar z_n) q) = L(q) \quad \forall q \in \mathbb{C}[z, \bar z]_{k-1,k-1} \tag{13e}
\end{align}
Here (13c) and (13d) are constraints that any collection of moments should satisfy (with (13b)
enforcing linearity), while (13e) expresses the $\sum_{i=1}^n |z_i|^2 = 1$ constraint (and can in general be
replaced with any polynomial constraint; see [73, 64, 72]). To see that (13) is an SDP, observe that
(13e) is a linear constraint and (13d) is equivalent to the constraint that the moment matrix
$M(L) \succeq 0$, where the entries of $M(L)$ are indexed by monomials in $\mathbb{C}[z, \bar z]_{k,0}$ and are defined by
$M(L)_{\alpha,\beta} := L(z^\alpha \bar z^\beta)$. We can interpret this SDP as replacing the maximum over $S^{2n-1}$ by a maximum over
probability distributions over $S^{2n-1}$ (which of course changes nothing), and in turn approximating
this by considering only the moments of order $k$. The dual of (13) is

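\begin{align}
\min\ & \lambda \quad \text{such that} \tag{14a} \\
& \lambda - p = \Big(\sum_{i=1}^n z_i \bar z_i - 1\Big) q + \sum_{i=1}^m q_i \bar q_i \tag{14b} \\
& q \in \mathbb{C}[z, \bar z]_{k-1,k-1} \tag{14c} \\
& q_1, \ldots, q_m \in \mathbb{C}[z, \bar z]_{k,0} \tag{14d}
\end{align}

which can again be seen to be an SDP. This can be thought of as "proving" that $p(z) \le \lambda$ by
using the fact that, on the unit sphere, $\lambda - p(z)$ is a sum of squares of polynomials; hence this
SDP is also called the "sum-of-squares" hierarchy.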

Under reasonable assumptions, this SDP converges to $\max_{z \in S^{2n-1}} p(z)$ as $k$
grows [73, 64]. However, since the effort to compute (13) or (14) grows exponentially with $k$,
it is important to determine the rate at which this convergence takes place. This rate is generally
well-understood for optimizations over the simplex, but less is known for the sphere [38].

At first glance, the sum-of-squares hierarchy may appear unrelated to the quantum de Finetti
theorems studied in this paper. However, the space $\mathbb{C}[z]_k$ is isomorphic to the symmetric subspace
of $(\mathbb{C}^n)^{\otimes k}$. Moreover, the relaxation in (13) is tight in the cases when $L$ approximates the evaluation
functional (i.e. $L(p) = p(z)$ for some $z \in \mathbb{C}^n$) on degree-$d$ polynomials, which is analogous to the
$d$-body marginals being approximately product. Indeed, this connection has been explored in [41],
where the sum-of-squares hierarchy was used to prove that $k$-extendable states are approximately
separable for sufficiently large $k$, and in [10], where this connection was used to find cases in
which the sum-of-squares hierarchy yielded a good approximation of the $2 \to 4$ norm of a matrix.

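To make the connection more explicit, we define, for any convex set $K$, the support function
of $K$ by

\[ h_K(x) := \sup_{y \in K} \langle x, y \rangle. \tag{15} \]

For matrices $x, y$ we define $\langle x, y \rangle := \mathrm{tr}\, x^\dagger y$. Then part 1 of Theorem 2 directly implies the following:
Let $M$ be a one-way LOCC operator of the form

\[ M = \sum_{i_2, \ldots, i_l} P_{i_2, \ldots, i_l} \otimes Q_{2,i_2} \otimes \cdots \otimes Q_{l,i_l} \tag{16} \]

with $0 \le P_{i_2, \ldots, i_l} \le I$ for each $i_2, \ldots, i_l$ and $0 \le \sum_i Q_{j,i} \le I$ for each $2 \le j \le l$. Then

\[ h_{\mathrm{Sep}(A^{\otimes l})}(M) \le h_{k\text{-}\mathrm{Ext}(A^{\otimes l})}(M) \le h_{\mathrm{Sep}(A^{\otimes l})}(M) + \sqrt{\frac{2 l^2 \ln|A|}{k}}. \tag{17} \]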

Given such an $M$ and defining $|z\rangle := (z_1, \ldots, z_n)$, we observe that $\langle z^{\otimes l}|M|z^{\otimes l}\rangle$ is a degree-$(l, l)$
polynomial in $z$. As a result, we immediately obtain a bound on the ability of the sum-of-squares
hierarchy to approximate certain polynomials over the complex hypersphere.
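
The correspondence between operators and polynomials can be checked numerically on a toy instance. The sketch below takes $l = 2$, $n = 2$ and the particular operator $M = |0\rangle\langle 0| \otimes |0\rangle\langle 0| + |1\rangle\langle 1| \otimes |1\rangle\langle 1|$; this choice and the helper names are illustrative assumptions, not taken from the paper. It evaluates $p(z) = \langle z^{\otimes 2}|M|z^{\otimes 2}\rangle$ and confirms that rescaling $z$ by a complex number $c$ multiplies $p$ by $|c|^4$, as expected for a bihomogeneous polynomial of degree $(2, 2)$.

```python
import math


def kron(u, v):
    """Kronecker product of two vectors given as lists."""
    return [x * y for x in u for y in v]


def kron_matrix(P, Q):
    """Kronecker product of two square matrices given as nested lists."""
    return [[P[i][j] * Q[k][l] for j in range(len(P)) for l in range(len(Q))]
            for i in range(len(P)) for k in range(len(Q))]


def add_matrices(X, Y):
    return [[x + y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]


def quadratic_form(M, v):
    """Return <v|M|v> for a complex vector v."""
    return sum(v[i].conjugate() * M[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))


# Toy operator with l = 2 and n = 2 (an illustrative choice):
# M = |0><0| (x) |0><0| + |1><1| (x) |1><1|.
E0 = [[1, 0], [0, 0]]
E1 = [[0, 0], [0, 1]]
M = add_matrices(kron_matrix(E0, E0), kron_matrix(E1, E1))


def p(z):
    """p(z) = <z (x) z| M |z (x) z>, which here equals |z_1|^4 + |z_2|^4."""
    zz = kron(z, z)
    return quadratic_form(M, zz).real


z = [complex(0.6, 0.0), complex(0.0, 0.8)]  # a point on the unit sphere
c = complex(1.0, 2.0)
scaled = [c * zi for zi in z]
print(math.isclose(p(scaled), abs(c) ** 4 * p(z)))  # True
```
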
Corollary 7. Let $p \in \mathbb{C}[z, \bar z]_{l,l}$ be of the form $p(z) = \langle z^{\otimes l}|M|z^{\otimes l}\rangle$ with $|z\rangle := (z_1, \ldots, z_n)$ and $M$
described by (16). Then

\[ \max_{\|z\|_2 = 1} p(z) \tag{18} \]

can be computed to within additive error $\epsilon$ by $O(\log(n)\, l^2/\epsilon^2)$ levels of the sum-of-squares hierarchy.

Note that the result of Chen and Drucker [29] implies that $\log(n)\, l^{2-o(1)}$ levels of the
sum-of-squares hierarchy are not sufficient to compute even a constant-error approximation to (18), for
general $p$ of the form described in the corollary, unless there is a subexponential time algorithm
for 3-SAT.

There is also more direct evidence that Corollary 7 cannot be improved to yield a PTAS for
polynomial optimization over the unit sphere. Ref. [25] proved that for any $n$, there exists a local
measurement $M$ (derived from a Bell inequality) on $n \times n$ systems such that

\[ \frac{\mathrm{tr}(M \Phi_n)}{h_{\mathrm{Sep}}(M)} \ge \Omega\left(\frac{n}{\log^2(n)}\right), \tag{19} \]
with <!>„ the projector onto the n-dimensional maximally entangled state. Since p := ^$ n + (l — 
is fc-extendable, it follows that the fc-extendable approximation can make multiplicative errors as 

lar S e asS7 (fck#R)- 

2.5 Testing Multipartite Separability 

Another application of part 1 of Theorem 2, closely related to Section 2.4, is to the quantum
separability problem, a well-studied problem in quantum information theory of both theoretical and
practical interest [50]. Given a multipartite state $\rho^{A_1 \cdots A_l}$ we say it is fully separable if

\[ \rho^{A_1 \cdots A_l} = \sum_j p_j\, \sigma_j^{A_1} \otimes \cdots \otimes \sigma_j^{A_l}, \tag{20} \]

for a probability distribution $\{p_j\}$ and quantum states $\sigma_j^{A_1}, \ldots, \sigma_j^{A_l}$.

The goal in the weak-membership problem for separability is to decide whether a given
multipartite state $\rho^{A_1 \cdots A_l}$ is separable or if it is $\epsilon$-away from any separable state, given the promise that
one of the two alternatives holds true. In fact one has a family of problems depending on which
norm we choose to quantify the distance of quantum states. We consider two choices of norms.
The first is the one-way LOCC norm, defined as

\[ \|X\|_{\mathrm{LOCC}^{\leftarrow}} := \max_{\Lambda_2, \ldots, \Lambda_l} \|\mathrm{id} \otimes \Lambda_2 \otimes \cdots \otimes \Lambda_l(X)\|_1. \tag{21} \]

The name comes from the interpretation of the norm as $\max_M \mathrm{tr}(MX)$, with $M$ any POVM element
that can be implemented by parties $2, \ldots, l$ measuring their systems locally and communicating
the outcome to party 1, who then performs a measurement dependent on the information
received. Therefore we have one-directional communication from all the parties to party 1.

The second is a multipartite version of the Frobenius norm recently introduced by Lancien and
Winter [63]:

\[ \|X\|_{2(l)} := \sqrt{\sum_{I \subseteq [l]} \mathrm{tr}\, |\mathrm{tr}_I X|^2}. \tag{22} \]
Corollary 8. For some $c > 0$, the Sum-of-Squares hierarchy solves the weak membership problem for
separability for the norm $\|\cdot\|_{\mathrm{LOCC}^{\leftarrow}}$ in time

\[ \exp\left( c \Big( \sum_{i=1}^{l} \log|A_i| \Big)^2 l^2 \epsilon^{-2} \right). \tag{23} \]

In turn, the Sum-of-Squares hierarchy solves the weak membership problem for separability for the norm
$\|\cdot\|_{2(l)}$ in time

\[ \exp\left( c \Big( \sum_{i=1}^{l} \log|A_i| \Big)^2 (18)^{l/2}\, l^2 \epsilon^{-2} \right). \tag{24} \]

See Section 7 for the proof.

We note this gives a generalization of the result of [22], which proved the same result for
bipartite quantum states. An earlier generalization of [22] to multipartite states was given in [21];
however there only a bound of

\[ \exp\left( c \log|A_1| \cdots \log|A_l|\; l^{2l-1} \epsilon^{-2(l-1)} \right) \tag{25} \]

was obtained for the running time of the algorithm.



2.6 Pretty-Good Tomography in Permutation-Symmetric States 

A final application of part 1 of Theorem 2 is to quantum state tomography, in which one obtains
a description of an unknown quantum system by making measurements on the system. In
quantum state tomography one tries to obtain a classical description of an unknown quantum state in
the form of a density matrix for the state. By performing sufficiently many measurements with a
sufficiently large number of different measurement settings one can obtain an arbitrarily good
approximation of the true quantum state. Typically one considers a situation in which one has access
to many independent and identically distributed (i.i.d.) copies of an unknown quantum state, and
one performs measurements on those copies in order to learn the identity of the quantum state.
Mathematically we can model this situation as saying that the global quantum state is of the form

\[ \omega^n = \int \sigma^{\otimes n} \mu(d\sigma), \tag{26} \]

for an unknown measure $\mu$ on quantum states. However the assumption of having many i.i.d.
copies of an unknown state cannot always be ensured, and in many situations it simply does not
hold true. It is thus an important task to try to relax this requirement. It has long been realized
[27] that quantum de Finetti theorems are exactly the right tool here. Instead of having to assume
that $\omega^n$ has the form given by Eq. (26), one can merely assume that $\omega^n$ is the reduced state of a
larger permutation-symmetric state $\omega^{n+k}$. Then for $k$ sufficiently large $\omega^n$ will be close to a convex
combination of i.i.d. states. The point is that one can easily ensure the latter situation by selecting
$n$ subsystems at random from the $n + k$ available ones.

The state of affairs is more complicated once complexity is taken into account: Given a
quantum state of $l$ qubits one must generally collect the statistics from $2^{O(l)}$ different measurement
settings in order to be able to reconstruct from the measurement data a density matrix that is a
faithful representation of the quantum state. This exponential complexity is to be expected since
after all a quantum state of $l$ qubits contains an exponential number of independent parameters.
However in many cases most of these parameters are irrelevant for the kind of information one
would like to extract from the quantum state. For instance if all one cares about are expectation
values of single-qubit observables, then only a linear number of parameters suffices. Is there a
way to exploit this intuition in order to construct more efficient tomographic schemes?

One beautiful result in this direction was obtained by Aaronson in Ref. [1], using tools from
computational learning theory [55], and can be roughly stated as follows: Given an arbitrary
distribution $\mathcal{M}$ over measurements and an unknown quantum state on $l$ qubits, $O(l)$ measurement
settings are sufficient to get a density matrix which, with high probability over the measurement
choice from $\mathcal{M}$, agrees with the expectation of the true quantum state up to small error. Thus
a number of measurement settings that is linear in the number of qubits of the state is enough to
get a density matrix which gives a good estimate to the statistics of the true state for almost all
choices of measurements; one can perform a "pretty-good" tomography just with a linear number
of measurement settings. The formal statement of Aaronson's result is as follows, restated slightly
in order to facilitate our later extension of the result.

Lemma 9 (Theorem 1.3 of [1]). Let $\omega^{m+n} \in \mathcal{D}(\mathcal{H}^{\otimes m+n})$ be a state of the form

\[ \omega^{m+n} = \int \nu(d\rho)\, \rho^{\otimes m+n}, \tag{27} \]

for a probability measure $\nu$ on $\mathcal{D}(\mathcal{H})$. Let $\mathcal{M}$ be a distribution over two-outcome measurements on $\mathcal{H}$ and
$\mathcal{E} = (E_1, \ldots, E_m)$ a training set of independently sampled measurements from $\mathcal{M}$. Suppose we measure
the first $m$ systems of $\omega$ according to $\mathcal{E}$ and obtain outcomes $B = (b_1, \ldots, b_m) \in \{0, 1\}^m$. For any outcome
$B$, we will choose a hypothesis state

\[ \sigma_B := \arg\min_{\sigma} \sum_{i=1}^{m} (\mathrm{tr}(E_i \sigma) - b_i)^2. \tag{28} \]

Then there exists a constant $K > 0$ such that if

\[ m \ge \frac{K}{\gamma^2 \epsilon^2} \left( \frac{\log|\mathcal{H}|}{\gamma^2 \epsilon^2} \log^2 \frac{1}{\gamma \epsilon} + \log \frac{1}{\delta} \right), \tag{29} \]

then with probability at least $1 - \delta$ the post-measured state $\omega^n$ satisfies

\[ \omega^n = \int \rho^{\otimes n} \mu(d\rho), \tag{30} \]

where the measure $\mu$ only has non-zero support on states $\rho$ such that

\[ \Pr_{E \sim \mathcal{M}} \left[ |\mathrm{tr}(E\rho) - \mathrm{tr}(E\sigma_B)| > \gamma \right] \le \epsilon. \]

A limitation of Aaronson's result [1], common to other tomographic schemes as well, is the
assumption that one is given several i.i.d. copies of the unknown quantum state. Here too one
could try to apply the standard quantum de Finetti theorems [62, 32, 80] to find a way around this
assumption. However since the error in those depends polynomially on the dimension of the state,
one would obtain a non-trivial result only if one selected subsystems at random from a state
of $2^{O(l)}$ subsystems, which is not a reasonable assumption. Theorem 2 allows us to circumvent
this problem.

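Corollary 10. Let $\omega^{m+n+k} \in \mathcal{D}(\mathcal{H}^{\otimes m+n+k})$ be a permutation-symmetric state, let $\mathcal{M}$ be a
distribution over two-outcome measurements on $\mathcal{H}$, and let $\mathcal{E} = (E_1, \ldots, E_m)$ be a training set consisting of $m$
measurements drawn independently from $\mathcal{M}$. Suppose we discard the last $k$ systems, measure the first $m$
systems of $\omega$ according to $\mathcal{E}$ and obtain outcomes $B = (b_1, \ldots, b_m) \in \{0, 1\}^m$. For any outcome $B$, we will
choose a hypothesis state

\[ \sigma_B := \arg\min_{\sigma} \sum_{i=1}^{m} (\mathrm{tr}(E_i \sigma) - b_i)^2. \tag{31} \]

Fix error parameters $\epsilon, \delta, \gamma, \nu > 0$. Suppose that (for some universal constant $K > 0$) we have

\[ m \ge \frac{K}{\gamma^2 \epsilon^2} \left( \frac{\log|\mathcal{H}|}{\gamma^2 \epsilon^2} \log^2 \frac{1}{\gamma \epsilon} + \log \frac{1}{\delta} \right) \tag{32} \]

and

\[ k \ge \frac{4 (m+n)^2 \ln|\mathcal{H}|}{\nu^2}. \tag{33} \]

Then with probability at least $1 - \delta$ the post-measured state $\omega^n$ satisfies

\[ \max_{\Lambda_1, \ldots, \Lambda_n} \left\| \Lambda_1 \otimes \cdots \otimes \Lambda_n \left( \omega^n - \int \rho^{\otimes n} \mu(d\rho) \right) \right\|_1 \le \nu, \tag{34} \]

with the maximum over quantum-classical channels $\Lambda_1, \ldots, \Lambda_n$. Here the measure $\mu$ only has non-zero
support on states $\rho$ such that

\[ \Pr_{E \sim \mathcal{M}} \left[ |\mathrm{tr}(E\rho) - \mathrm{tr}(E\sigma_B)| > \gamma \right] \le \epsilon. \tag{35} \]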

The proof of Corollary 10 follows immediately from part 1 of Theorem 2 and Lemma 9.
Let us say a few words about the interpretation of the result. Suppose we had Eq. (34) with
$\nu = 0$. Then

\[ \omega^n = \int \rho^{\otimes n} \mu(d\rho), \tag{36} \]

with $\mu$ a measure with non-zero support only on states $\rho$ that, for most measurements drawn from $\mathcal{M}$, give
approximately the same statistics as any state $\sigma_B$ compatible with the observed data (in the sense
that it satisfies Eq. (31)). Therefore any state $\sigma_B$ compatible with the measured data can be used
to correctly infer the statistics of future measurements, with high probability over the choice of
the observable. For non-zero $\nu$ we have a similar situation. While the state $\omega^n$ might be very far
away from a convex combination of i.i.d. states in trace norm, if we only consider the statistics of local
measurements on the $n$ subsystems, then, up to error $\nu$, we have the same conclusions as in the
case of $\nu = 0$.

The price we have to pay for being able to relax the assumption of having i.i.d. copies of the
state is that instead of starting from $O(\log|\mathcal{H}|) + n$ copies of the state, now we need a global state
with $O((O(\log|\mathcal{H}|) + n)^2 \log|\mathcal{H}|)$ subsystems (of which we only measure $O(\log|\mathcal{H}|)$). The
main point is that this is still polynomial in the number of qubits of the unknown state one wants
to learn.

We note that while this approach gives an efficient alternative for tomography of states on a
large number of qubits as far as the number of measurements needed is concerned, it says nothing
about the computational complexity of finding the hypothesis state $\sigma_B$. As noted in [1], it is an
interesting problem to determine for which classes of states one can obtain the hypothesis state efficiently.

3 Proof of Theorem 1

We will prove Theorem 1 by information-theoretic techniques, inspired by (Till and Lemma 4.5 of
IZll. Given two quantum states $\rho, \sigma \in \mathcal{D}(\mathcal{H})$, we define the quantum relative entropy (or quantum
Kullback-Leibler divergence) as

\[ S(\rho \| \sigma) := \mathrm{tr}(\rho (\ln \rho - \ln \sigma)). \tag{37} \]

Given a bipartite state $\rho^{AB} \in \mathcal{D}(A \otimes B)$ we define the mutual information as

\[ I(A : B)_\rho := S(\rho^{AB} \| \rho^A \otimes \rho^B). \tag{38} \]

Given a tripartite qqc state of the form $\rho^{ABX} := \sum_k p_k\, \rho_k^{AB} \otimes |k\rangle\langle k|^X$ we define the conditional
mutual information as

\[ I(A : B | X)_\rho := \sum_k p_k\, I(A : B)_{\rho_k}. \tag{39} \]
The mutual information satisfies the following properties that will be useful in the proof:

Lemma 11.

1. Chain Rule:

\[ I(A : BX) = I(A : X) + I(A : B | X). \tag{40} \]

2. Monotonicity under Local Operations: Let $\pi^{AB} = \mathrm{id} \otimes \Lambda(\rho^{AB})$; then

\[ I(A : B)_\pi \le I(A : B)_\rho. \tag{41} \]

3. Pinsker's Inequality:

\[ I(A : B)_\rho \ge \frac{1}{2} \|\rho^{AB} - \rho^A \otimes \rho^B\|_1^2. \tag{42} \]

(The absence of the usual $\ln(2)$ factor in (42) is because of our convention that entropies are
measured in "nats," i.e. with logs taken base $e$.)
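
The chain rule and Pinsker's inequality can be sanity-checked numerically in the classical special case, where all systems are random variables and the mutual information is the usual Shannon quantity in nats. The sketch below does this for one arbitrary distribution on three bits; the distribution and the helper names are illustrative assumptions, and a single numerical check is of course no substitute for the general quantum statements.

```python
import math
from itertools import product


def marginal(p, keep):
    """Marginalize a joint distribution {outcome tuple: prob} onto the indices in keep."""
    out = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


def mutual_information(p, left, right):
    """I(left : right) in nats for index lists left and right."""
    p_l, p_r, p_lr = marginal(p, left), marginal(p, right), marginal(p, left + right)
    total = 0.0
    for key, prob in p_lr.items():
        if prob > 0:
            kl, kr = key[:len(left)], key[len(left):]
            total += prob * math.log(prob / (p_l[kl] * p_r[kr]))
    return total


def conditional_mutual_information(p, left, right, cond):
    """I(left : right | cond) = sum_x p(x) I(left : right) evaluated on p(. | x)."""
    total = 0.0
    for x, px in marginal(p, cond).items():
        if px == 0:
            continue
        conditioned = {o: pr / px for o, pr in p.items()
                       if tuple(o[i] for i in cond) == x}
        total += px * mutual_information(conditioned, left, right)
    return total


# An arbitrary correlated distribution on bits (a, b, x); indices 0, 1, 2.
weights = {o: 1.0 + o[0] * o[1] + 2.0 * o[0] * o[2] + 0.5 * o[1]
           for o in product((0, 1), repeat=3)}
norm = sum(weights.values())
p = {o: w / norm for o, w in weights.items()}

A, B, X = [0], [1], [2]
chain_lhs = mutual_information(p, A, B + X)
chain_rhs = mutual_information(p, A, X) + conditional_mutual_information(p, A, B, X)

p_ab, p_a, p_b = marginal(p, A + B), marginal(p, A), marginal(p, B)
l1 = sum(abs(p_ab[(a, b)] - p_a[(a,)] * p_b[(b,)]) for a, b in product((0, 1), repeat=2))
pinsker_gap = mutual_information(p, A, B) - 0.5 * l1 ** 2
print(math.isclose(chain_lhs, chain_rhs), pinsker_gap >= 0)  # True True
```
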
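We are now ready to prove Theorem 1.

Theorem 1 (restatement).

1. Let $\rho^{AB} \in \mathcal{D}(A \otimes B)$ be a $k$-extendible state and $\mu(m)$ a distribution over quantum operations
$\{\Lambda_{A,m}\}_m$, with $\Lambda_{A,m} : \mathcal{D}(A) \to \mathcal{D}(X)$. Then

\[ \min_{\sigma \in \mathrm{Sep}(A:B)} \max_{\Lambda_B \in \mathcal{M}} \mathbb{E}_{m \sim \mu} \left\| \Lambda_{A,m} \otimes \Lambda_B (\rho^{AB} - \sigma^{AB}) \right\|_1 \le \sqrt{\frac{2 \ln|X|}{k}}. \tag{43} \]

2. Let $\rho^{AB} \in \mathcal{D}(A \otimes B)$ be a $k$-extendible state, $\mu(m)$ a distribution over operators $\{\Lambda_{A,m}\}_m$ from
$\mathcal{D}(A) \to \mathcal{D}(X)$ and $\Lambda_B$ a measurement on $\mathcal{D}(B)$. Then in time $\mathrm{poly}(|A|, |B|^k)$ a classical computer
can compute $\sigma \in \mathrm{Sep}(A : B)$ such that

\[ \mathbb{E}_{m \sim \mu} \left\| \Lambda_{A,m} \otimes \Lambda_B (\rho^{AB} - \sigma^{AB}) \right\|_1 \le \sqrt{\frac{2 \ln|X|}{k}}. \tag{44} \]

3. Let $p(x, y | a, b)$, with $(x, y, a, b) \in X \times Y \times A \times B$, be a $k$-extendible non-signaling conditional
probability distribution and let $\mu$ be a distribution over $A$. Then

\[ \min_{q \in \mathrm{LHV}} \max_{b \in B} \mathbb{E}_{a \sim \mu} \left\| p(x, y | a, b) - q(x, y | a, b) \right\|_1 \le \sqrt{\frac{2 \ln|X|}{k}}. \tag{45} \]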

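Proof. The three parts of the theorem have similar proofs.

Part 1: Consider the state

\[ \pi^{A B_1 \ldots B_k M} := \mathbb{E}_{m \sim \mu} \left[ (\Lambda_{A,m} \otimes \Lambda_{B_1} \otimes \cdots \otimes \Lambda_{B_k})(\rho^{A B_1 \ldots B_k}) \otimes |m\rangle\langle m|^M \right], \tag{46} \]

with $\Lambda_{A,m}$ quantum operations from $A$ to $X$, $\Lambda_{B_i}$ quantum-classical channels, and $|m\rangle$ a classical
label for which quantum operation $\Lambda_{A,m}$ was applied. Repeatedly applying the chain rule (40),
we find

\[ I(A : B_1 \ldots B_k | M) = I(A : B_1 | M) + I(A : B_2 | M B_1) + \cdots + I(A : B_k | M B_1 \ldots B_{k-1}). \tag{47} \]

Now we maximize over measurements and obtain

\[ \max_{\Lambda_{B_1}, \ldots, \Lambda_{B_k} \in \mathcal{M}} I(A : B_1 \cdots B_k | M)_\pi = \max_{\Lambda_{B_1}, \ldots, \Lambda_{B_{k-1}} \in \mathcal{M}} \Big( I(A : B_1 | M)_\pi + \cdots + I(A : B_{k-1} | M B_1 \ldots B_{k-2})_\pi + \max_{\Lambda_{B_k} \in \mathcal{M}} I(A : B_k | M B_1 \ldots B_{k-1})_\pi \Big). \tag{48} \]

Now

\[ I(A : B_k | M B_1 \ldots B_{k-1})_\pi = \mathbb{E}_{m \sim \mu}\, I(A : B_k | B_1 \ldots B_{k-1})_{\pi_m}, \tag{49} \]

with $\pi_m := (\Lambda_{A,m} \otimes \Lambda_{B_1} \otimes \cdots \otimes \Lambda_{B_k})(\rho^{A B_1 \ldots B_k})$. Since the $B_1 \ldots B_{k-1}$ systems of $\pi_m$ are classical,
we can write the state $\rho^{A B_k}$ as an average over them, namely

\[ \rho^{AB} = \rho^{A B_k} = \sum_i q_i\, \rho_i^{A B_k}, \tag{50} \]

where $\{q_i, \rho_i\}$ depend on $\Lambda_{B_1}, \ldots, \Lambda_{B_{k-1}}$ but not on $\Lambda_{A,m}$ and $\Lambda_{B_k}$. Then define

\[ \pi_{i,m}^{A B_k} := (\Lambda_{A,m} \otimes \Lambda_{B_k})(\rho_i^{A B_k}), \tag{51} \]

so that $\pi_m^{A B_k} = \sum_i q_i\, \pi_{i,m}^{A B_k}$ and

\[ I(A : B_k | B_1 \ldots B_{k-1})_{\pi_m} = \sum_i q_i\, I(A : B_k)_{\pi_{i,m}}. \tag{52} \]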
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By Pinsker 's inequality 

I{A : B k \B x . . . B k _ x \ m >\^q% II A Am <8> A Bfe 



A B 

where pf and p, k are the A and B k reduced states of pi. 



pi- Pi ®Pi k 



By convexity of x and the trace norm 



I{A : B k \Bi, . . . ,B k _i) nm > - 



Using Eq. @9 



(53) 



(54) 



max /(A : Bk\MBi ...B k _x)^ > - max E 



Ai, m 8) A Bfc p ABfc - Y QiPf ® Pf 



2 

(55) 
l 



> - min max E IIA^ m <g> A B ,. (p 
2 CT eSEP(A:B fc ) AB k &M m~At" 



Note that the second line is independent of A^, . . . , As fe _ 1/ since only the ensemble p^} de- 
pended on them. 



From @8l) and (55), 



max /(A : J3i...£JM) 7r 

A Bl ,...,A Bfc 6A4 



(56) 



> max V /(A : J3,-|Af J3i . . . 

" A fll ,...,A flfc _ ie A4^ Jl J 

1 2 

+ - min max E | m ® A Bfc (p - cr) | L . 

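Applying the same argument sequentially to all the remaining conditional mutual information terms, we find

$$\frac{k}{2} \min_{\sigma \in \mathrm{SEP}(A:B)} \max_{\Lambda_B \in \mathcal{M}} \mathbb{E}_{m \sim \mu} \left\| \Lambda_{A,m} \otimes \Lambda_B \left( \rho^{AB} - \sigma^{AB} \right) \right\|_1^2 \leq \max_{\Lambda_{B_1}, \ldots, \Lambda_{B_k}} I(A : B_1 \ldots B_k | M)_\pi \leq \ln |X|, \tag{57}$$

where we used that $\pi^{A} = \mathbb{E}_{m \sim \mu} \left( \Lambda_{A,m}(\rho^{A}) \right) \in \mathcal{D}(X)$. Finally, by convexity of $x^2$,

$$\min_{\sigma \in \mathrm{SEP}(A:B)} \max_{\Lambda_B \in \mathcal{M}} \mathbb{E}_{m \sim \mu} \left\| \Lambda_{A,m} \otimes \Lambda_B \left( \rho^{AB} - \sigma^{AB} \right) \right\|_1 \leq \sqrt{\frac{2 \ln |X|}{k}}, \tag{58}$$

and we are done with the proof of part 1.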
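To get a feel for the scaling in Eq. (58), the following sketch evaluates the bound $\sqrt{2 \ln|X| / k}$ and the smallest number of extensions $k$ for which it drops below a target error. The function names are hypothetical and the numbers are purely illustrative.

```python
import math


def theorem1_error_bound(k, dim_x):
    """Right-hand side of Eq. (58): sqrt(2 ln|X| / k)."""
    if k < 1 or dim_x < 1:
        raise ValueError("need k >= 1 and |X| >= 1")
    return math.sqrt(2.0 * math.log(dim_x) / k)


def extensions_for_error(eps, dim_x):
    """Smallest integer k with sqrt(2 ln|X| / k) <= eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return max(1, math.ceil(2.0 * math.log(dim_x) / eps ** 2))


# Example: Alice's operations have 16 outcomes, target error 0.1.
k_needed = extensions_for_error(0.1, 16)
bound_at_k = theorem1_error_bound(k_needed, 16)
bound_below_k = theorem1_error_bound(k_needed - 1, 16)
```

Note that only the output size $|X|$ of Alice's operations enters, not the dimension of $A$ or $B$, so $k$ grows logarithmically in $|X|$ for fixed error.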

Part 2: The proof of part 2 is mostly the same as that of part 1, and so we only give a brief outline of the changes. The main change is to omit the maximizations over $\Lambda_1, \ldots, \Lambda_k$, instead using only the fixed measurement $\Lambda_B$. We also set $\sigma = \sum_i q_i \rho_i^{A} \otimes \rho_i^{B_k}$ rather than performing a minimization. As a result, the calculations require only time polynomial in the dimensions of the relevant states.

Part 3: The proof of part 3 is similar to that of part 1, except that we need to make the following replacements.

Part 1 | Part 3
quantum states $\rho^{AB_1 \ldots B_k}$ | non-signaling distributions $p(x, y_1, \ldots, y_k | a, b_1, \ldots, b_k)$
quantum mutual information | classical mutual information maximized over choices of measurements $a, b_1, \ldots, b_k$
partial trace | no-signaling condition

For brevity we will use the abbreviations $b^k := (b_1, \ldots, b_k)$, $b^{k-1} := (b_1, \ldots, b_{k-1})$ and so on. In more detail, the analogue of (46) is to define the non-signaling distribution $\pi$ from $B^k \to X \times Y^k$:

$$\pi(x, y^k, a | b^k) = \mu(a)\, p(x, y^k | a, b^k). \tag{59}$$

We can also define

$$\pi(x, y^{k-1}, a | b^{k-1}) = \mu(a)\, p(x, y^{k-1} | a, b^k), \tag{60}$$

and, thanks to the no-signaling property of $p$, this is well-defined, since the RHS does not depend on $b_k$.
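The no-signaling property used here can be checked mechanically for a two-party box $p(x, y | a, b)$: each party's marginal must not depend on the other party's input. The sketch below does this for the Popescu-Rohrlich box and for a deliberately signaling box. Both boxes and the function names are illustrative assumptions and do not appear in the paper.

```python
from itertools import product

BITS = (0, 1)


def pr_box(x, y, a, b):
    """Popescu-Rohrlich box: outputs satisfy x XOR y = a AND b, uniformly at random."""
    return 0.5 if (x ^ y) == (a & b) else 0.0


def signaling_box(x, y, a, b):
    """A box in which the first party outputs the second party's input (signaling)."""
    return 1.0 if (x == b and y == 0) else 0.0


def is_non_signaling(p, tol=1e-12):
    """Check that each party's marginal is independent of the other party's input."""
    # Marginal of x given a must not depend on b.
    for a, x in product(BITS, BITS):
        margs = [sum(p(x, y, a, b) for y in BITS) for b in BITS]
        if max(margs) - min(margs) > tol:
            return False
    # Marginal of y given b must not depend on a.
    for b, y in product(BITS, BITS):
        margs = [sum(p(x, y, a, b) for x in BITS) for a in BITS]
        if max(margs) - min(margs) > tol:
            return False
    return True


pr_ok = is_non_signaling(pr_box)
sig_ok = is_non_signaling(signaling_box)
```

In the multi-party setting of the proof, the same condition is imposed for every subset of parties, which is what makes the marginal in Eq. (60) well-defined.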

Again the chain rule gives us an analogue of (48):

$$\max_{b^k} I(X : Y^k | A)_{\pi(\cdot | b^k)} = \max_{b^k} \left( \sum_{j=1}^{k-1} I(X : Y_j | A Y^{j-1})_{\pi(\cdot | b^j)} + I(X : Y_k | A Y^{k-1})_{\pi(\cdot | b^k)} \right). \tag{61}$$
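Again we focus on the last term of Eq. (61). Define $\ell := (a, b^k, y^{k-1})$, and compute

$$\begin{aligned}
\max_{b_k} I(X : Y_k | A Y^{k-1})_{\pi(\cdot | b^k)}
&= \max_{b_k} \mathbb{E}_{a \sim \mu} I(X : Y_k | Y^{k-1})_{p(\cdot | a, b^k)} \\
&= \max_{b_k} \mathbb{E}_{a \sim \mu} \sum_{y^{k-1}} p(y^{k-1} | b^{k-1})\, I(X : Y_k)_{p(\cdot | \ell)} \\
&\geq \frac{1}{2} \max_{b_k \in B} \mathbb{E}_{a \sim \mu} \sum_{y^{k-1}} p(y^{k-1} | b^{k-1}) \Big( \sum_{x \in X,\, y_k \in Y} \big| p(x, y_k | \ell) - p(x | \ell) p(y_k | \ell) \big| \Big)^2 && \text{(Pinsker)} \\
&\geq \frac{1}{2} \max_{b_k \in B} \mathbb{E}_{a \sim \mu} \Big( \sum_{y^{k-1}} p(y^{k-1} | b^{k-1}) \sum_{x \in X,\, y_k \in Y} \big| p(x, y_k | \ell) - p(x | \ell) p(y_k | \ell) \big| \Big)^2 && \text{(convexity of } x \mapsto x^2\text{)} \\
&\geq \frac{1}{2} \max_{b_k \in B} \mathbb{E}_{a \sim \mu} \Big( \sum_{x \in X,\, y_k \in Y} \Big| p(x, y_k | a, b_k) - \sum_{y^{k-1}} p(y^{k-1} | b^{k-1}) p(x | \ell) p(y_k | \ell) \Big| \Big)^2 && \text{(convexity of } \| \cdot \|_1\text{)} \\
&\geq \frac{1}{2} \min_{q} \max_{b} \mathbb{E}_{a \sim \mu} \left\| p(X, Y_k | a, b) - q(X, Y_k | a, b) \right\|_1^2.
\end{aligned}$$

As with part 1, we can repeatedly apply this inequality to (61) in order to prove the theorem.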

□ 

4 Proof of Theorem [2] 

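For a state $\rho^{A_1 \ldots A_k}$ we define the multipartite mutual information

$$I(A_1 : \ldots : A_k) := S(\rho^{A_1 \ldots A_k} \| \rho^{A_1} \otimes \ldots \otimes \rho^{A_k}) = S(A_1) + \ldots + S(A_k) - S(A_1 \ldots A_k). \tag{62}$$

For a quantum-classical state $\rho^{A_1 \ldots A_k R} = \sum_i p_i\, \rho_i^{A_1 \ldots A_k} \otimes |i\rangle\langle i|^{R}$ we define the conditional multipartite mutual information as follows:

$$I(A_1 : \ldots : A_k | R)_\rho := \sum_i p_i\, I(A_1 : \ldots : A_k)_{\rho_i}. \tag{63}$$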

The multipartite mutual information satisfies the following properties: 
Lemma 12. 

1. Multipartite-to-Bipartite Jlggty : 

$$I(A_1 : \ldots : A_k | R) = I(A_1 : A_2 | R) + I(A_1 A_2 : A_3 | R) + \ldots + I(A_1 \ldots A_{k-1} : A_k | R). \tag{64}$$

2. Monotonicity under Local Operations: Let $\pi^{A_1 \ldots A_k} = \Lambda_{A_1} \otimes \mathrm{id}_{A_2 \ldots A_k}(\rho^{A_1 \ldots A_k})$, then

$$I(A_1 : \ldots : A_k)_\pi \leq I(A_1 : \ldots : A_k)_\rho. \tag{65}$$

3. Pinsker's Inequality:

$$I(A_1 : \ldots : A_k)_\rho \geq \frac{1}{2} \left\| \rho^{A_1 \ldots A_k} - \rho^{A_1} \otimes \ldots \otimes \rho^{A_k} \right\|_1^2. \tag{66}$$
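For classical systems the multipartite mutual information of Eq. (62) is the total correlation $\sum_j H(A_j) - H(A_1 \ldots A_k)$, and the multipartite-to-bipartite identity of Eq. (64) can be verified directly. The sketch below does so for $k = 3$ without the conditioning system $R$; the random test distribution and the helper names are assumptions made for illustration.

```python
import itertools
import math
import random


def marginal_entropy(p, idx):
    """Shannon entropy (nats) of the marginal of p on the positions in idx."""
    marg = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + prob
    return -sum(v * math.log(v) for v in marg.values() if v > 0)


def multipartite_mutual_info(p, parts):
    """I(P_1 : ... : P_r) = sum_j H(P_j) - H(P_1 ... P_r); each part is a tuple of positions."""
    joint = tuple(i for part in parts for i in part)
    return sum(marginal_entropy(p, part) for part in parts) - marginal_entropy(p, joint)


# Random joint distribution of three bits A_1, A_2, A_3 (positions 0, 1, 2).
rng = random.Random(1)
outcomes = list(itertools.product((0, 1), repeat=3))
weights = [rng.random() for _ in outcomes]
p = {o: w / sum(weights) for o, w in zip(outcomes, weights)}

lhs = multipartite_mutual_info(p, [(0,), (1,), (2,)])        # I(A_1 : A_2 : A_3)
rhs = (multipartite_mutual_info(p, [(0,), (1,)])             # I(A_1 : A_2)
       + multipartite_mutual_info(p, [(0, 1), (2,)]))        # I(A_1 A_2 : A_3)
```

The identity holds term by term because the entropy $H(A_1 A_2)$ cancels between the two bipartite terms.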



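Theorem [2] (restatement).

1. Let $\rho^{A_1 \ldots A_k} \in \mathcal{D}(A^{\otimes k})$ be a permutation-invariant state. Then for every $0 < l < k$ there is a measure $\nu$ on $\mathcal{D}(A)$ such that

$$\max_{\Lambda_2, \ldots, \Lambda_l \in \mathcal{M}} \left\| (\mathrm{id} \otimes \Lambda_2 \otimes \ldots \otimes \Lambda_l) \left( \rho^{A_1 \ldots A_l} - \int \nu(d\sigma)\, \sigma^{\otimes l} \right) \right\|_1 \leq \sqrt{\frac{2 l^2 \ln |A|}{k - l}}. \tag{67}$$

2. Let $p(X_1 \cdots X_k | A_1 \cdots A_k)$ be a permutation-invariant non-signaling conditional probability distribution (i.e. $p$ is invariant under simultaneous permutation of the $X$ and $A$ systems). Fix a product distribution $\mu = \mu_1 \otimes \cdots \otimes \mu_k$ on $A_1 \times \cdots \times A_k$. Then for every $0 < l < k$ there is a measure $\nu$ on single-system conditional probability distributions such that

$$\mathbb{E}_{a_1, \ldots, a_l \sim \mu} \left\| p(X_1 \cdots X_l | a_1, \ldots, a_l) - \mathbb{E}_{q \sim \nu}\, q(X_1 | a_1) \cdots q(X_l | a_l) \right\|_1 \leq \sqrt{\frac{2 l^2 \ln |X|}{k - l}}. \tag{68}$$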

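Proof.
Part 1:
Let

$$\pi^{A_1 \ldots A_l R} := (\mathrm{id}_{A_1} \otimes \Lambda_2 \otimes \ldots \otimes \Lambda_l \otimes \mathcal{E}_{A_{l+1} \ldots A_k})(\rho^{A_1 \ldots A_k}), \tag{69}$$

with $\Lambda_j : \mathcal{D}(A) \to \mathcal{D}(X)$ and $\mathcal{E} : \mathcal{D}(A^{\otimes (k-l)}) \to \mathcal{D}(R)$ quantum-classical channels. Then from Eq. (64) of Lemma [12],

$$\begin{aligned}
\min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} I(A_1 : \ldots : A_l | R)_\pi
&= \min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} \sum_{j=2}^{l} I(A_1 \ldots A_{j-1} : A_j | R)_\pi \\
&\leq \min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} \sum_{j=2}^{l} I(A_1 \ldots A_{j-1} : A_j | R)_{\pi_j},
\end{aligned} \tag{70}$$

with

$$\pi_j := (\mathrm{id}_{A_1 \ldots A_{j-1}} \otimes \Lambda_j \otimes \mathrm{id}_{A_{j+1} \ldots A_l} \otimes \mathcal{E}_{A_{l+1} \ldots A_k})(\rho^{A_1 \ldots A_k}). \tag{71}$$

The last inequality in Eq. (70) follows by the monotonicity of the mutual information under local operations (Eq. (65) of Lemma [12]). Then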

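$$\begin{aligned}
\min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} I(A_1 : \ldots : A_l | R)_\pi
&\leq \min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} \sum_{j=2}^{l} I(A_1 \ldots A_{j-1} : A_j | R)_{\pi_j} \\
&= \min_{\mathcal{E}} \sum_{j=2}^{l} \max_{\Lambda_j} I(A_1 \ldots A_{j-1} : A_j | R)_{\pi_j} \\
&\leq \min_{\mathcal{E}}\, (l - 1) \max_{\Lambda_l} I(A_1 \ldots A_{l-1} : A_l | R)_{\pi_l},
\end{aligned} \tag{72}$$

where the last inequality follows from the monotonicity of mutual information under tracing out and the permutation invariance of the state $\rho^{A_1 \ldots A_k}$.
We claim that

$$\min_{\mathcal{E}} \max_{\Lambda_l} I(A_1 \ldots A_{l-1} : A_l | R)_{\pi_l} \leq \frac{(l - 1) \ln |A|}{k - l + 1}. \tag{73}$$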

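Indeed, defining $\nu^{A_1 \ldots A_k} := (\mathrm{id}_{A_1 \ldots A_{l-1}} \otimes \Lambda_l \otimes \ldots \otimes \Lambda_k)(\rho^{A_1 \ldots A_k})$, for quantum-classical channels $\Lambda_j$, we have

$$\begin{aligned}
\max_{\Lambda_l, \ldots, \Lambda_k} I(A_1 \ldots A_{l-1} : A_l \ldots A_k)_\nu
&= \max_{\Lambda_l, \ldots, \Lambda_k} \sum_{j=l}^{k} I(A_1 \ldots A_{l-1} : A_j | A_{j+1} \ldots A_k)_\nu \\
&= \max_{\Lambda_{l+1}, \ldots, \Lambda_k} \Big( \sum_{j=l+1}^{k} I(A_1 \ldots A_{l-1} : A_j | A_{j+1} \ldots A_k)_\nu + \max_{\Lambda_l} I(A_1 \ldots A_{l-1} : A_l | A_{l+1} \ldots A_k)_\nu \Big) \\
&\geq \max_{\Lambda_{l+1}, \ldots, \Lambda_k} \Big( \sum_{j=l+1}^{k} I(A_1 \ldots A_{l-1} : A_j | A_{j+1} \ldots A_k)_\nu + \min_{\mathcal{E}} \max_{\Lambda_l} I(A_1 \ldots A_{l-1} : A_l | R)_{\pi_l} \Big),
\end{aligned} \tag{74}$$

where in the last inequality we used the definition of the state $\pi_l$ given by Eq. (71) and the monotonicity of mutual information under local operations. Iterating the argument we find

$$\max_{\Lambda_l, \ldots, \Lambda_k} I(A_1 \ldots A_{l-1} : A_l \ldots A_k)_\nu \geq (k - l + 1) \min_{\mathcal{E}} \max_{\Lambda_l} I(A_1 \ldots A_{l-1} : A_l | R)_{\pi_l}, \tag{75}$$

and obtain Eq. (73) from the bound $(l - 1) \ln |A| \geq I(A_1 \ldots A_{l-1} : A_l \ldots A_k)_\nu$. Combining it with Eq. (72) we get

$$\min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} I(A_1 : \ldots : A_l | R)_\pi \leq \frac{(l - 1)^2 \ln |A|}{k - l + 1}. \tag{76}$$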



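We now show how to combine this bound with a few properties of the measure $I(A_1 : \ldots : A_l | R)$ to complete the proof. We have

$$I(A_1 : \ldots : A_l | R)_\pi = \sum_i p_i\, I(A_1 : \ldots : A_l)_{\pi_i}, \tag{77}$$

with $\pi_i := (\mathrm{id}_{A_1} \otimes \Lambda_2 \otimes \cdots \otimes \Lambda_l)(\rho_i)$, for an ensemble $\{p_i, \rho_i\}$ such that each $\rho_i \in \mathcal{D}(A^{\otimes l})$ is permutation-invariant and $\sum_i p_i \rho_i = \rho^{A_1 \ldots A_l}$. Then, by Pinsker's inequality (Eq. (66)) and the convexity of $x^2$:

$$\min_{\mathcal{E}} \max_{\Lambda_2, \ldots, \Lambda_l} I(A_1 : \ldots : A_l | R)_\pi \geq \frac{1}{2} \left\| (\mathrm{id}_{A_1} \otimes \Lambda_2 \otimes \cdots \otimes \Lambda_l) \left( \rho^{A_1 \ldots A_l} - \sum_i p_i\, \rho_i^{A_1} \otimes \ldots \otimes \rho_i^{A_l} \right) \right\|_1^2.$$

Part 1 of the theorem follows from Eq. (76).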
Part 2: Let $p(x_1, \ldots, x_k | a_1, \ldots, a_k)$ be a permutation-symmetric non-signaling distribution and $\mu = \mu_1 \times \cdots \times \mu_k$ a product distribution on $A_1 \times \cdots \times A_k$. We will use the abbreviations $X_{<i} := X_1 \ldots X_{i-1}$ and $X_{>i} := X_{i+1} \ldots X_k$.

\begin{align}
\min_{a_{l+1}, \ldots, a_k} \mathbb{E}_{a_1, \ldots, a_l} I(X_1 : \ldots : X_l | X_{>l})_p
&= \min_{a_{l+1}, \ldots, a_k} \mathbb{E}_{a_1, \ldots, a_l} \sum_{j=2}^{l} I(X_{<j} : X_j | X_{>l})_p \tag{78a} \\
&= \min_{a_{l+1}, \ldots, a_k} \sum_{j=2}^{l} \mathbb{E}_{a_1, \ldots, a_l} I(X_{<j} : X_j | X_{>l})_p \tag{78b} \\
&\leq (l - 1) \min_{a_{l+1}, \ldots, a_k} \mathbb{E}_{a_1, \ldots, a_l} I(X_{<l} : X_l | X_{>l})_p \tag{78c}
\end{align}

To derive the last inequality, observe that $I(X_{<j} : X_j | X_{>l}) = I(X_{<j} : X_l | X_{>l}) \leq I(X_{<l} : X_l | X_{>l})$, where the equality is from the symmetry of $p$ and the inequality is from the monotonicity of mutual information under tracing out systems.
Next,

\begin{align}
(l - 1) \ln |X|
&\geq \min_{a_{l+1}, \ldots, a_k} \sum_{j=l}^{k} \mathbb{E}_{a_1, \ldots, a_l} I(X_{<l} : X_j | X_{>j})_p \tag{79a} \\
&= \min_{a_{l+1}, \ldots, a_k} \Big( \sum_{j=l+1}^{k} \mathbb{E}_{a_1, \ldots, a_{l-1}} I(X_{<l} : X_j | X_{>j})_p + \mathbb{E}_{a_1, \ldots, a_l} I(X_{<l} : X_l | X_{>l})_p \Big) \tag{79b} \\
&\geq \min_{a_{l+1}, \ldots, a_k} \sum_{j=l+1}^{k} \mathbb{E}_{a_1, \ldots, a_{l-1}} I(X_{<l} : X_j | X_{>j})_p + \min_{a_{l+1}, \ldots, a_k} \mathbb{E}_{a_1, \ldots, a_l} I(X_{<l} : X_l | X_{>l})_p \tag{79c}
\end{align}

Iterating, we find that

\begin{align}
\min_{a_{l+1}, \ldots, a_k} \mathbb{E}_{a_1, \ldots, a_l} I(X_{<l} : X_l | X_{>l})_p &\leq \frac{(l - 1) \ln |X|}{k - l + 1} \tag{80} \\
\min_{a_{l+1}, \ldots, a_k} \mathbb{E}_{a_1, \ldots, a_l} I(X_1 : \ldots : X_l | X_{>l})_p &\leq \frac{(l - 1)^2 \ln |X|}{k - l + 1} \quad \text{using Eq. (78)} \tag{81}
\end{align}

Fix $a_{l+1}, \ldots, a_k$ achieving the minimum in Eq. (81). Using the non-signaling property, we can decompose

$$p(X_{\leq l} | A_{\leq l}) = \sum_{x_{>l}} p(x_{>l} | a_{>l})\, p(X_{\leq l} | A_{\leq l}, a_{>l}, x_{>l}). \tag{82}$$

The astute reader will realize that it is now time to deploy Pinsker's inequality (Eq. (66)). Along with Eq. (81) and the convexity of $x^2$, this concludes the proof of the theorem. □



5 Proof of Corollary IH 

The first lemma is an adaptation of a similar result of Kempe, Kobayashi, Matsumoto, Toner, and Vidick [56]. It shows that by symmetrizing the questions and answers of a subset $S$ of the players one can without loss of generality assume that the players follow a symmetric strategy (in the case of classical, entangled, or non-signaling strategies).

Lemma 13. Let $G(N, \pi, V)$ be a non-signaling-prover game such that $\pi(i_1, \ldots, i_N)$ is symmetric in $i_1, \ldots, i_m$ and $V$ is symmetric under simultaneous permutation of registers $1, \ldots, m$ of the questions $Q_{i_1, \ldots, i_N}$ and of the answers $O_{i_1, \ldots, i_N}$, $1 \leq m \leq N$. Then given any non-signaling strategy that wins with probability $p$, there exists a symmetric strategy with respect to provers $1, \ldots, m$.

The next lemma gives a hardness of approximation result for approximating the classical value 
of free games. 

Lemma 14 (Aaronson-Impagliazzo-Moshkovitz-Shor [3]). 3-SAT with $n$ variables can be reduced to the problem of obtaining a constant error approximation to $\omega_c(G)$ for two-player one-round free games with $2^{O(\sqrt{n})}$-sized output alphabet.
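To make the classical value $\omega_c(G)$ of a free game concrete, the sketch below computes it by brute force over deterministic strategies, which suffices because the classical value is attained at a deterministic strategy. It uses the CHSH game as an example of a free game (uniform, independent questions). The function name and the choice of game are illustrative assumptions, and the running time is exponential in the number of questions, so this is only meant for tiny instances.

```python
from itertools import product


def classical_value(pi_r, pi_q, answers_a, answers_b, predicate):
    """Classical value of a free two-player game by enumerating deterministic strategies.

    pi_r, pi_q: dicts mapping each question to its probability (product distribution).
    predicate(a, b, r, q): returns 1 if answers (a, b) win on questions (r, q), else 0.
    """
    questions_r = list(pi_r)
    questions_q = list(pi_q)
    best = 0.0
    for f in product(answers_a, repeat=len(questions_r)):      # Alice: question -> answer
        for g in product(answers_b, repeat=len(questions_q)):  # Bob: question -> answer
            win = sum(pi_r[r] * pi_q[q] * predicate(f[i], g[j], r, q)
                      for i, r in enumerate(questions_r)
                      for j, q in enumerate(questions_q))
            best = max(best, win)
    return best


def chsh(a, b, r, q):
    """CHSH predicate: win iff a XOR b equals r AND q."""
    return 1 if (a ^ b) == (r & q) else 0


uniform = {0: 0.5, 1: 0.5}
value = classical_value(uniform, uniform, (0, 1), (0, 1), chsh)
```

For CHSH this recovers the familiar classical value of 3/4.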

Corollary |H (restatement). 

1. Let $G(2, \pi, V)$ be a two-player one-round non-local free game with $\pi$ a product probability distribution on $R \times Q$ and $V$ a predicate on $R \times Q \times A \times B$. Then there is an $(m+1)$-player one-round non-local game $G'(m+1, \pi', V')$ with $\pi'$ a probability distribution on $R \times Q_1 \times \ldots \times Q_m$, with $|Q_k| = |Q|$ for $k \in [m]$, and $V'$ a predicate on $R \times Q_1 \times \ldots \times Q_m \times A \times B_1 \times \ldots \times B_m$, with $|B_k| = |B|$ for $k \in [m]$, such that

$$\omega_c(G) \leq \omega_{ns}(G') \leq \omega_c(G) + \sqrt{\frac{\ln |A|}{2m}}. \tag{83}$$

2. For a free game $G(2, \pi, V)$ there is a linear-programming relaxation of size $|R||A|(|Q||B|)^{O(\ln |A| / \epsilon^2)}$ for computing $\omega_c(G)$ to within additive error $\epsilon$.

3. One can reduce 3-SAT on $n$ variables to computing $\omega_e(G)$ to within constant additive error for $O(\sqrt{n})$-player one-round non-local games with answer alphabet size of $\exp(O(\sqrt{n}))$ in which only two players are asked questions.

Proof.

Part 1: Define a game $G'$ in which the verifier chooses a pair $(r, q)$ from the distribution $\pi(r, q)$ and sends $r$ to the first prover (let us call it Alice) and $q$ to one of the other $m$ provers chosen at random (let us call them Bob 1 to Bob $m$). The verifier does not send a question to, and does not expect an answer from, the remaining Bobs. Then the verifier uses the answers obtained from Alice and the chosen Bob to compute $V(a, b | r, q)$. Applying Lemma [13] to the case of non-signaling games we can restrict the parties to use non-signaling distributions which are symmetric on the Bobs. Thus

$$\omega_{ns}(G') = \sup_{p} \sum_{q, r} \pi(r, q)\, \frac{1}{m} \sum_{k=1}^{m} \sum_{a, b_1, \ldots, b_m} V(a, b_k | r, q_k)\, p(a, b_1, \ldots, b_m | r, q_1, \ldots, q_m), \tag{84}$$
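where the supremum in the last line is taken over all $m$-extendible non-signaling distributions $p$. Then by Theorem [1]

$$\sup_{p \in m\text{-Ext}} \sum_{q, r} \pi(r, q) \sum_{a, b} V(a, b | r, q)\, p(a, b | r, q) \leq \sup_{s \in \mathrm{LHV}} \sum_{q, r} \pi(r, q) \sum_{a, b} V(a, b | r, q)\, s(a, b | r, q) + \sqrt{\frac{\ln |A|}{2m}}. \tag{85}$$

In more detail, since the game is free we have that $\pi(r, q) = \pi_1(r) \pi_2(q)$. Then

$$\begin{aligned}
\sum_{q, r} \pi(r, q) \sum_{a, b} V(a, b | r, q) \big( p(a, b | r, q) - s(a, b | r, q) \big)
&\leq \frac{1}{2} \mathbb{E}_{r \sim \pi_1} \mathbb{E}_{q \sim \pi_2} \left\| p(a, b | r, q) - s(a, b | r, q) \right\|_1 \\
&\leq \frac{1}{2} \mathbb{E}_{r \sim \pi_1} \max_{q \in Q} \left\| p(a, b | r, q) - s(a, b | r, q) \right\|_1.
\end{aligned} \tag{86}$$

From Theorem [1],

$$\min_{s \in \mathrm{LHV}} \mathbb{E}_{r \sim \pi_1} \max_{q \in Q} \left\| p(a, b | r, q) - s(a, b | r, q) \right\|_1 \leq \sqrt{\frac{2 \ln |A|}{m}}. \tag{87}$$

Combining (86) and (87) gives (85), since $\frac{1}{2} \sqrt{2 \ln |A| / m} = \sqrt{\ln |A| / (2m)}$.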

Part 2: Follows from part 1 and the fact that $\omega_{ns}$ can be computed by a linear program IIBTH .

Part 3: Follows from part 1 of this Lemma and part 1 of Lemma [14]. □

6 Proof of Corollary [6] 

We begin with a definition of analogues of QMA with multiple unentangled proofs. 

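Definition 15. A language $L$ is in $\mathcal{M}$-QMA$_n(m, s, c)$ if there exists a polynomial-time implementable two-outcome measurement $\{M_x, I - M_x\}$ from the class $\mathcal{M}$ such that

1. Completeness: If $x \in L$, there exist $m$ proofs $|\psi_1\rangle, \ldots, |\psi_m\rangle$, each of $n$ qubits, such that

$$\mathrm{tr}\left( M_x \left( |\psi_1\rangle\langle\psi_1| \otimes \ldots \otimes |\psi_m\rangle\langle\psi_m| \right) \right) \geq c. \tag{88}$$

2. Soundness: If $x \notin L$, then for any states $|\psi_1\rangle, \ldots, |\psi_m\rangle$,

$$\mathrm{tr}\left( M_x \left( |\psi_1\rangle\langle\psi_1| \otimes \ldots \otimes |\psi_m\rangle\langle\psi_m| \right) \right) \leq s. \tag{89}$$

If $\mathcal{M}$ is the class of all polynomial-time implementable two-outcome measurements we denote the complexity class simply by QMA$_n(m, s, c)$.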
Some examples of classes of measurements that we consider in this paper are: 
1. Bell is composed of measurements $0 \leq M \leq I$ of the form

$$M = \sum_{(i_1, \ldots, i_m) \in S} M_{1, i_1} \otimes \ldots \otimes M_{m, i_m}, \tag{90}$$

where $\sum_i M_{j,i} = I$ for all $j \in [m]$, and $S$ is a set of $m$-tuples of indices. In words, the $m$ subsystems are measured locally giving outcomes $(i_1, \ldots, i_m)$ and the verifier accepts if $(i_1, \ldots, i_m) \in S$.

2. LOCC$_1$ is composed of measurements of the form

$$M = \sum_i M_{1,i} \otimes \ldots \otimes M_{m,i} \tag{91}$$

such that $\sum_i M_{1,i} \leq I$, and $0 \leq M_{k,i} \leq I$ for all $i$ and every $k \in \{2, \ldots, m\}$.

3. SEP is composed of measurements $0 \leq M \leq I$ such that

$$M = \sum_i M_{1,i} \otimes \cdots \otimes M_{m,i} \tag{92}$$

for positive semi-definite matrices $M_{k,i}$.
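The description of Bell measurements "in words" can be turned into a small calculation. For a product proof, the local outcomes are independent, so the acceptance probability is the total probability that the outcome tuple lands in $S$. The sketch below assumes the local outcome distributions $p_j(i) = \mathrm{tr}(M_{j,i} \rho_j)$ have already been computed and are passed in as plain lists; the function name and the example numbers are hypothetical.

```python
from math import prod


def bell_accept_probability(local_outcome_dists, accept_set):
    """Acceptance probability of a Bell measurement on a product state.

    local_outcome_dists[j][i] is the probability that subsystem j yields outcome i.
    accept_set is the set S of accepted outcome tuples (i_1, ..., i_m).
    """
    return sum(prod(dist[i] for dist, i in zip(local_outcome_dists, outcome))
               for outcome in accept_set)


# Two subsystems with uniformly random binary outcomes; accept when the outcomes agree.
dists = [[0.5, 0.5], [0.5, 0.5]]
accept_if_equal = {(0, 0), (1, 1)}
prob = bell_accept_probability(dists, accept_if_equal)
```

For entangled proofs the outcome distribution no longer factorizes, and the sum would instead run over the joint outcome distribution.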

See B6H for more examples of classes of measurements as well as relations between them. 
We will also make use of QMA with multiple identical proofs: 

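Definition 16. A language $L$ is in $\mathcal{M}$-SymQMA$_n(m, s, c)$ if there exists a polynomial-time implementable two-outcome measurement $\{M_x, I - M_x\}$ from the class $\mathcal{M}$ such that

1. Completeness: If $x \in L$, there exists a proof $|\psi\rangle$ of $n$ qubits such that

$$\mathrm{tr}\left( M_x |\psi\rangle\langle\psi|^{\otimes m} \right) \geq c. \tag{93}$$

2. Soundness: If $x \notin L$, then for any state $|\psi\rangle$,

$$\mathrm{tr}\left( M_x |\psi\rangle\langle\psi|^{\otimes m} \right) \leq s. \tag{94}$$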

We now turn to the proof of Corollary [6].
Corollary [6] (restatement).

1. BellSymQMA$_n(m, c, s) \subseteq$ QMA$_{10 n^2 m^2 / \epsilon^2}(c, s + \epsilon)$.

2. For every $\epsilon > 0$ and $c - s = \Omega(1)$, there is no BellSymQMA$_{O(\log n)}(n^{1/2 - \epsilon}, c, s)$ protocol for 3-SAT with $n$ variables and $O(n)$ clauses, unless 3-SAT can be solved in $\exp(n^{1 - 2\epsilon}\, \mathrm{polylog}(n))$ time.

3. BellQMA$_n(m, c, s) \subseteq$ QMA$_{10 n^2 m^3 / \epsilon^2}(c, s + \epsilon)$.

4. QMA$_{\mathrm{poly}(n)}(2/3, 1/3)$ = BellQMA$_{\mathrm{poly}(n)}(\mathrm{poly}(n), 2/3, 1/3)$.
Proof.

Part 1: To simulate a BellSymQMA$_n(m, c, s)$ protocol in QMA$_{10 n^2 m^2 / \epsilon^2}(c, s + \epsilon)$ the verifier receives the proof of $10 n^2 m^2 / \epsilon^2$ qubits from the prover and considers it as $10 n m^2 / \epsilon^2$ blocks of $n$ qubits. Then he symmetrizes all the blocks, traces out all of them except the first $m$ blocks and runs the original BellQMA protocol on them. It is clear that completeness is not changed. To analyze soundness we use part 1 of Theorem [2].
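The bookkeeping in this simulation is simple arithmetic, sketched below: the single proof is split into $10 n m^2 / \epsilon^2$ blocks of $n$ qubits, giving $10 n^2 m^2 / \epsilon^2$ qubits in total. The function name is hypothetical, and rounding the block count up to an integer is an assumption made here for concreteness.

```python
import math


def symmetric_simulation_size(n, m, eps):
    """Number of n-qubit blocks and total proof length for simulating BellSymQMA_n(m, c, s).

    Returns (blocks, total_qubits) with blocks = ceil(10 n m^2 / eps^2).
    """
    if n < 1 or m < 1 or eps <= 0:
        raise ValueError("need n >= 1, m >= 1 and eps > 0")
    blocks = math.ceil(10 * n * m ** 2 / eps ** 2)
    return blocks, blocks * n


# Example: proofs of 3 qubits, 2 copies, soundness loss 0.5.
blocks, total_qubits = symmetric_simulation_size(n=3, m=2, eps=0.5)
```

Only the first $m$ blocks are kept after symmetrization; the remaining blocks serve to make the reduced state close to a mixture of product states under local measurements.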

Part 2: Follows easily from the previous part. 

Part 3: To simulate a BellQMA$_n(m, c, s)$ protocol in QMA$_{10 n^2 m^3 / \epsilon^2}(c, s + \epsilon)$ the verifier receives the proof of $10 n^2 m^3 / \epsilon^2$ qubits from the prover and considers it as $10 n m^2 / \epsilon^2$ blocks of $nm$ qubits. Then he symmetrizes all the blocks and traces out all of them except the first $m$ blocks. Then he divides each of these $m$ blocks into $m$ sub-blocks of $n$ qubits. Let us denote the $i$-th sub-block of the $j$-th block by $X_{i,j}$. Then the verifier runs the original BellQMA protocol using the state in subsystems $X_{1,1}, X_{2,2}, \ldots, X_{m,m}$ as a proof. It is clear that completeness is not changed. To analyze soundness we use part 1 of Theorem [2].

Part 4: Follows easily from the previous part. 

□ 



7 Proof of Corollary [8]

Corollary [8] (restatement). For some c > 0, the Sum-of-Squares hierarchy solves the weak membership 
problem for separability for the norm \\ * \\ locc^ in time 



exp c log | A,- 1 I 



V 2 



(95) 



In turn, the Sum-of-Squares hierarchy solves the weak membership problem for separability for the norm 
|| * || 2 m in time 



cxp 



(96) 



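Proof. Given a state $\rho^{A_1, \ldots, A_l} \in \mathcal{D}(A_1 \otimes \cdots \otimes A_l)$ the algorithm consists of searching by semidefinite programming for a state $\sigma^{X^1 \ldots X^k}$, with $X^j := X^j_1 \cdots X^j_l$, with $X^j_i \cong A_i$ for all $i, j$, such that $\sigma^{X^1_1, \ldots, X^1_l} = \rho^{A_1, \ldots, A_l}$. We choose $k = l + 4 l^2 \epsilon^{-2} \sum_j \log |A_j|$. Then by Theorem [2] we have that there exists a measure $\nu$ on $\mathcal{D}(A_1 \otimes \cdots \otimes A_l)$ such that

$$\max_{\Lambda_2, \ldots, \Lambda_l \in \mathcal{M}} \left\| (\mathrm{id} \otimes \Lambda_2 \otimes \cdots \otimes \Lambda_l) \left( \rho^{X^1 \ldots X^k} - \int \nu(d\sigma)\, \sigma^{\otimes l} \right) \right\|_1 \leq \epsilon. \tag{97}$$

Then from monotonicity of the trace norm under partial trace,

$$\min_{\sigma \in \mathrm{Sep}(A_1 : \cdots : A_l)} \max_{\Lambda_2, \ldots, \Lambda_l \in \mathcal{M}} \left\| (\mathrm{id} \otimes \Lambda_2 \otimes \cdots \otimes \Lambda_l) \left( \rho^{X^1 \ldots X^k} - \sigma \right) \right\|_1 \leq \epsilon. \tag{98}$$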
Given a separable $\rho^{A_1, \ldots, A_l}$, we can find an extension $\sigma^{X^1 \ldots X^k}$ for every $k$. Conversely, Eq. (98) shows that if $\rho^{A_1, \ldots, A_l}$ is $\epsilon$-away in one-way LOCC norm from separable, then it does not have an extension for $k = l + 4 l^2 \epsilon^{-2} \sum_j \log |A_j|$.
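The extension level used by the algorithm can be tabulated directly from the formula $k = l + 4 l^2 \epsilon^{-2} \sum_j \log |A_j|$. The sketch below also reports the logarithm of the dimension of the extended system, $k \sum_j \log |A_j|$, which governs the size of the semidefinite program. Two assumptions are made for illustration: the logarithm is taken to be natural, and the level is rounded up to an integer. The function names are hypothetical.

```python
import math


def extension_level(dims, eps):
    """k = l + ceil(4 l^2 eps^-2 sum_j ln|A_j|) for local dimensions dims = [|A_1|, ..., |A_l|]."""
    if eps <= 0 or any(d < 1 for d in dims):
        raise ValueError("need eps > 0 and positive local dimensions")
    l = len(dims)
    log_dim = sum(math.log(d) for d in dims)
    return l + math.ceil(4 * l ** 2 * log_dim / eps ** 2)


def log_extension_dimension(dims, eps):
    """Natural log of the dimension of the k-fold extended system: k * sum_j ln|A_j|."""
    return extension_level(dims, eps) * sum(math.log(d) for d in dims)


# Example: two qubits, error parameter 1.
k_two_qubits = extension_level([2, 2], 1.0)
log_dim_two_qubits = log_extension_dimension([2, 2], 1.0)
```

Since $k$ is itself proportional to $\sum_j \log |A_j|$, the logarithm of the extended dimension is quadratic in $\sum_j \log |A_j|$, which is the source of the quasipolynomial running time.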

The bound for $\| \cdot \|_{2(l)}$ follows from the reasoning above and the following bound (given by Theorem 5 of [55]):

$$\| X \|_{\mathrm{LOCC}^{\rightarrow}} \geq 18^{-l/2} \| X \|_{2(l)}. \tag{99}$$

□ 

8 Open Problems 

It would be desirable to strengthen several of the results in this work: 

1. Conjecture [5] is a proposed improvement of Theorem [1] that would imply that $O(\log(k))$-extendable states cannot be distinguished from separable states by Bell measurements with
k outcomes per party. As we discuss in Section W2\ this would have a very interesting appli- 
cation to the complexity of non-local games. 

2. We would also like to improve Theorem [1] to apply to separable measurements instead of merely 1-LOCC measurements. If this were true, it would imply by the results of [46], that QMA$_n(m, c, s) \subseteq$ QMA$_{mn^2/\epsilon}(1, c, s + \epsilon)$. It would also yield quasipolynomial-time classical algorithms for separability testing and a large number of tensor optimization problems
described in 061 . 

3. One of the few barriers to improving de Finetti theorems is the example of the maximally mixed state on the antisymmetric subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$ [32]. This so-called "universal counterexample" state is $(d-1)$-extendable, and yet is far from separable. However, this distinguishability cannot be achieved by a measurement whose "not separable" outcome is itself a separable measurement operator; aka a "SEP-YES" measurement. By [46], proving a more efficient de Finetti theorem against such measurements would improve the algorithm for approximating $h_{\mathrm{SEP}}(M)$ for general measurements $M$. Additionally, the antisymmetric state is not PPT, and such examples of highly-extendable far-from-separable states are not known to occur when we add the PPT constraint, as proposed by [41]. Intriguingly, the "worst" known example (i.e. most extendable while being far from separable) of a PPT state is only $O(\log d)$-extendable [24]$^9$. It would be of great interest either to prove a better bound on the
combination of PPT and ^-extendable constraints (see llTOH or Section 9.3.2 of IflOll for some 
progress), or to find better counterexample states. 

4. It would also be interesting to use our information-theoretic techniques to examine the various extensions of the de Finetti theorem. For example, is there a version of the post-selection technique [31] where the dimension dependence is replaced by a dependence on the number of measurement outcomes? One difficulty here (highlighted by taking the local dimension to be infinite) is in choosing the right test state upon which the channels should act. Another question is whether our techniques can improve the exponential de Finetti theorem [79]. Unfortunately, this theorem is known not to have a classical analogue (due to unpublished work of Christandl and Toner), while our proofs use entropic properties of classical, or classical-quantum, states.

9 In more detail this follows by considering a variant of the example of [24] (page 6) in which the EPR pair is replaced by a constant-dimensional PPT entangled state. Note that by [14] for every $\epsilon > 0$ there is a PPT state with trace distance $2 - \epsilon$ from separable states, thus for every $\epsilon > 0$ one can get a PPT $O(\log(d))$-extendible $d \times d$ state which is $(2 - \epsilon)$-away from any separable state. The same holds true if we use the 1-LOCC version of the trace norm.



Acknowledgments 

We are grateful to Thomas Vidick for many helpful comments on an early version of the paper, to Scott Aaronson for telling us about [3] in 2010, and especially to Boaz Barak, Jon Kelner and David Steurer for sharing with us an early version of [11]. We also benefited from interesting discussions with Matthias Christandl and Stephanie Wehner. FGSLB acknowledges support from the Swiss National Science Foundation, via the National Centre of Competence in Research QSIT. AWH was funded by NSF grants 0916400, 0829937, 0803478, DARPA QuEST contract FA9550-09-1-0044 and the IARPA MUSIQC and QCS contracts.



References 

[1] S. Aaronson. The learnability of quantum states. Proc. R. Soc. A, 463:2088, 2007. arXiv:quant- 
ph/0608142. 

[2] S. Aaronson, S. Beigi, A. Drucker, B. Fefferman, and P. Shor. The power of unentanglement. 
Theory of Computing, 5(1):1-42, 2009. arXiv:0804.0802.

[3] S. Aaronson, R. Impagliazzo, D. Moshkovitz, and P. Shor. AM with multiple Merlins, 2012. 
in preparation. 

[4] S. Arora and B. Barak. Computational Complexity: A Modern Approach. Cambridge University 
Press, 2009. 

[5] S. Arora, B. Barak, and D. Steurer. Subexponential algorithms for unique games and related 
problems. In FOCS, pages 563-572, 2010. 

[6] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501-555, May 1998.

[7] S. Arora and S. Safra. Probabilistic checking of proofs: a new characterization of NP. J. ACM,
45(1):70-122, Jan. 1998.

[8] A. Aspect, J. Dalibard, and G. Roger. Experimental test of Bell's inequalities using time-varying analyzers. Phys. Rev. Lett., 49:1804-1807, Dec 1982.

[9] L. Babai, L. Fortnow, and C. Lund. Non-deterministic exponential time has two-prover interactive protocols. Computational Complexity, 1:3-40, 1991.

[10] B. Barak, F. G. Brandao, A. W. Harrow, J. Kelner, D. Steurer, and Y. Zhou. Hypercontractivity,
sum-of-squares proofs, and their applications. In STOC '12, pages 307-326, 2012.
arXiv:1205.4484.






[11] B. Barak, J. Kelner, and D. Steurer. Iterative rounding for sum-of-squares relaxations, 2012. 
unpublished manuscript. 



[12] J. Barrett and M. Leifer. The de Finetti theorem for test spaces. New J. Phys., 11:033024, 2009.
arXiv:0712.2265. 

[13] S. Beigi. NP vs QMA_log(2). Quant. Inf. Comp., 10(1&2):0141-0151, 2010. arXiv:0810.5109. 

[14] S. Beigi and P. Shor. Approximating the set of separable states using the positive partial 
transpose test. J. Math. Phys., 51:042202, 2010.

[15] J. Bell. On the Einstein-Podolsky-Rosen paradox. Physics, 1:195, 1964. 

[16] M. Bellare, U. Feige, and J. Kilian. On the role of shared randomness in two prover proof 
systems. In ISTCS '95, pages 199-208, 1995. 

[17] C. Bennett, E. Bernstein, G. Brassard, and U. Vazirani. The strengths and weaknesses of quan- 
tum computation. SIAM Journal on Computing, 26:1510-1523, 1997. arXiv:quant-ph/9701001. 

[18] C. H. Bennett, D. P. DiVincenzo, J. A. Smolin, and W. K. Wootters. Mixed-state entanglement 
and quantum error correction. Phys. Rev. A, 52:3824-3851, 1996. arXiv:quant-ph/9604024. 

[19] H. Blier and A. Tapp. All languages in NP have very short quantum proofs. In First Inter- 
national Conference on Quantum, Nano, and Micro Technologies, pages 34-37, Los Alamitos, CA, 
USA, 2009. IEEE Computer Society. arXiv:0709.0738.

[20] F. Brandao. Entanglement Theory and the Quantum Simulation of Many-Body Physics. PhD thesis, 
Imperial College London, 2008. arXiv:0810.0026. 

[21] F. Brandao and M. Christandl. Detection of multiparticle entanglement: Quantifying the 
search for symmetric extensions, 2011. arXiv:1105.5720. 

[22] F. G. Brandao, M. Christandl, and J. Yard. A quasipolynomial-time algorithm for the quantum 
separability problem. In Proceedings of the 43rd annual ACM symposium on Theory of computing, 
STOC '11, pages 343-352, 2011. arXiv:1011.2751. 

[23] F. G. Brandao and M. B. Plenio. A generalization of quantum Stein's lemma. Commun. Math. 
Phys., 295:791, 2010. arXiv:0904.0281. 

[24] F. G. S. L. Brandao, M. Christandl, and J. Yard. Faithful squashed entanglement. Commun. 
Math. Phys., 306(3):805-830, 2011. arXiv:1010.1750. 

[25] H. Buhrman, O. Regev, G. Scarpa, and R. de Wolf. Near-optimal and explicit Bell inequality
violations. In CCC '11, pages 157-166, 2011. arXiv:1012.5043.

[26] J.-Y. Cai, A. Condon, and R. Lipton. Playing games of incomplete information. In STACS '90, 
volume 415, pages 58-69, 1990. 

[27] C. M. Caves, C. A. Fuchs, and R. Schack. Unknown quantum states: The quantum de Finetti 
representation. J. Math. Phys., 43(9):4537-4559, 2002. arXiv:quant-ph/0104088.






[28] A. Chailloux and O. Sattath. The complexity of the separable Hamiltonian problem, 2011.
arXiv:1111.5247.



[29] J. Chen and A. Drucker. Short multi-prover quantum proofs for SAT without entangled mea- 
surements, 2010. arXiv:1011.0716. 

[30] A. Chiesa and M. Forbes. Improved soundness for QMA with multiple provers, 2011. 
arXiv:1108.2098. 

[31] M. Christandl, R. Koenig, and R. Renner. Post-selection technique for quantum channels with 
applications to quantum cryptography. Phys. Rev. Lett., 102:020504, 2009. arXiv:0809.3019. 

[32] M. Christandl, R. König, G. Mitchison, and R. Renner. One-and-a-half quantum de Finetti
theorems. Commun. Math. Phys., 273:473-498, 2007. arXiv:quant-ph/0602130.

[33] M. Christandl and B. Toner. Finite de Finetti theorem for conditional probability distributions 
describing physical theories. J. Math. Phys., 50(4):042104, 2009. arXiv:0712.0916.

[34] R. Cleve, D. Gavinsky, and R. Jain. Entanglement-resistant two-prover interactive proof sys- 
tems and non-adaptive pir's. Quantum Info. Comput., 9(7):648-656, July 2009. arXiv:0707.1729. 

[35] R. Cleve, P. Hoyer, B. Toner, and J. Watrous. Consequences and limits of nonlocal strategies. 
In CCC '04, pages 236-249, 2004. arXiv:quant-ph/0404076. 

[36] F. Cobos, T. Kühn, and J. Peetre. Remarks on symmetries of trilinear forms. Rev. R. Acad.
Cienc. Exact. Fis.Nat. (Esp), 94(4):441-449, 2000. 

[37] J. P. D'Angelo and M. Putinar. Polynomial optimization on odd-dimensional spheres. In 
M. Putinar and S. Sullivant, editors, Emerging Applications of Algebraic Geometry, volume 149 
of The IMA Volumes in Mathematics and its Applications, pages 1-15. Springer New York, 2009. 

[38] E. de Klerk. The complexity of optimizing over a simplex, hypercube or sphere: A short 
survey. Central European Journal of Operations Research, 16(2):111-125, 2008.

[39] P. Diaconis and D. Freedman. Finite exchangeable sequences. Annals of Probability, 8:745-764,
1980. 

[40] A. C. Doherty, Y.-C. Liang, B. Toner, and S. Wehner. The quantum moment problem and 
bounds on entangled multi-prover games. In CCC '08, pages 199-210, 2008. arXiv:0803.4373. 

[41] A. C. Doherty, P. A. Parrilo, and F. M. Spedalieri. Complete family of separability criteria. 
Phys. Rev. A, 69:022308, Feb 2004. arXiv:quant-ph/0308032. 

[42] A. C. Doherty and S. Wehner, 2009. personal communication. 

[43] A. C. Doherty and S. Wehner. Convergence of sdp hierarchies for polynomial optimization 
on the hypersphere, 2012. arXiv:1210.5048. 

[44] M. Fannes, J. T. Lewis, and A. Verbeure. Symmetric states of composite systems. Lett. Math. 
Phys., 15:255-260, 1988. 






[45] S. Gharibian, J. Sikora, and S. Upadhyay. QMA variants with polynomially many provers,
2011. arXiv:1108.0617.

[46] A. W. Harrow and A. Montanaro. An efficient test for product states, with applications to 
quantum Merlin-Arthur games. In FOCS '10, pages 633-642, 2010. arXiv:1001.0017.

[47] R. L. Hudson and G. R. Moody. Locally normal symmetric states and an analogue of de 
Finetti's theorem. Z. Wahrschein. verw. Geb., 33:343-351, 1976. 

[48] R. Impagliazzo and R. Paturi. On the complexity of k-sat. Journal of Computer and System 
Sciences, 62(2):367-375, 2001. 

[49] R. Impagliazzo, R. Paturi, and F. Zane. Which problems have strongly exponential complex- 
ity? In FOCS'98, pages 653-662. IEEE, 1998. 

[50] L. Ioannou. Computational complexity of the quantum separability problem. Quantum Infor- 
mation and Computation, 7(4):335, 2007. 

[51] T. Ito. Polynomial-space approximation of no-signaling provers. In ICALP'10, pages 140-151,
2010. arXiv:0908.2363.

[52] T. Ito, H. Kobayashi, and K. Matsumoto. Oracularization and two-prover one-round interac- 
tive proofs against nonlocal strategies. In CCC '09, pages 217-228, 2009. arXiv:0810.0693. 

[53] T. Ito, H. Kobayashi, D. Preda, X. Sun, and A. C. C. Yao. Generalized tsirelson inequalities, 
commuting-operator provers, and multi-prover interactive proof systems. In CCC '08, pages 
187-198, 2008. arXiv:0712.2163. 

[54] T. Ito and T. Vidick. A multi-prover interactive proof for nexp sound against entangled 
provers. In FOCS '12, 2012. arXiv:1207.0550. 

[55] M. Kearns and U. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994. 

[56] J. Kempe, H. Kobayashi, K. Matsumoto, B. Toner, and T. Vidick. Entangled games are hard to 
approximate. SIAM J. Comput., 40(3):848-877, 2011. arXiv:0704.2903.

[57] J. Kempe, O. Regev, and B. Toner. Unique games with entangled provers are easy. SIAM J.
Comput., 39(7):3207-3229, July 2010. arXiv:0710.0655.

[58] J. Kempe and T. Vidick. Parallel repetition of entangled games. In STOC '11, pages 353-362,
2011. arXiv:1012.4728.

[59] S. Khot and M. Safra. A two prover one round game with strong soundness. In FOCS '11, 
pages 648-657, 2011. 

[60] H. Kobayashi and K. Matsumoto. Quantum multi-prover interactive proof systems with 
limited prior entanglement. J. Comput. Syst. Sci., 66(3):429-450, May 2003. arXiv:cs/0102013.

[61] H. Kobayashi, K. Matsumoto, and T. Yamakami. Quantum Merlin-Arthur proof systems: 
Are multiple Merlins more helpful to Arthur? In ISAAC, volume 2906, pages 189-198, 2003. 
arXiv:quant-ph/0306051. 






[62] R. Koenig and R. Renner. A de Finetti representation for finite symmetric quantum states. J.
Math. Phys., 46(12):122108, 2005. arXiv:quant-ph/0410229.

[63] C. Lancien and A. Winter. Distinguishing multi-partite states by local measurements, 2012. 
arXiv:1206.2884. 

[64] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. 
Opt., 11(3):796-817, 2001. 

[65] F. Le Gall, S. Nakagawa, and H. Nishimura. On QMA protocols with two short quantum 
proofs. Quant. Inf. Comp., 12:0589, 2012. arXiv:1108.4306. 

[66] K. Li and A. Winter. Relative entropy and squashed entanglement, 2012. to appear. 

[67] Ll. Masanes, A. Acin, and N. Gisin. General properties of nonsignaling theories. Phys. Rev. A,
73:012112, 2006.

[68] C. Marriott and J. Watrous. Quantum Arthur-Merlin games. Computational Complexity, 
14(2):122-152, 2005. arXiv:cs/0506068.

[69] M. McKague. On the power of quantum computation over real Hilbert spaces, 2011.
arXiv:1109.0795.

[70] M. Navascues, M. Owari, and M. B. Plenio. The power of symmetric extensions for entangle- 
ment detection. Phys. Rev. A, 80:052306, 2009. arXiv:0906.2731. 

[71] M. Navascues, S. Pironio, and A. Acin. A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations. New J. Phys., 10(7):073013, 2008.
arXiv:0803.4290.

[72] R. O'Donnell and Y. Zhou. Approximability and proof complexity. In SODA '13, 2013. To appear.

[73] P. A. Parrilo. Structured semidefinite programs and semialgebraic geometry methods in robustness
and optimization. Technical report, MIT, 2000. Ph.D. thesis.

[74] A. Pereszlenyi. Multi-prover quantum Merlin-Arthur proof systems with small gap, 2012.
arXiv:1205.2761. 

[75] M. Piani. Relative entropy of entanglement and restricted measurements. Phys. Rev. Lett., 
103:160504, 2009. 

[76] V. Powers and B. Reznick. A new bound for Polya's theorem with applications to polynomials 
positive on polyhedra. Journal of Pure and Applied Algebra, 164(1-2):221-229, 2001.

[77] G. A. Raggio and R. F. Werner. Quantum statistical mechanics of general mean field systems. 
Helv. Phys. Acta, 62:980-1003, 1989. 

[78] P. Raghavendra and N. Tan. Approximating CSPs with global cardinality constraints using 
SDP hierarchies. In SODA '12, pages 373-387, 2012. arXiv:1110.1064.

[79] R. Renner. Security of quantum key distribution. PhD thesis, ETHZ, Zurich, 2005.
arXiv:quant-ph/0512258.






[80] R. Renner. Symmetry implies independence. Nature Physics, 3:645-649, 2007.
arXiv:quant-ph/0703069.

[81] Y. Shi and X. Wu. Epsilon-net method for optimizations over separable states, 2011. 
arXiv:1112.0808. 

[82] E. Stormer. Symmetric states of infinite tensor products of C*-algebras. J. Funct. Anal., 3:48,
1969.

[83] B. M. Terhal, A. C. Doherty, and D. Schwab. Symmetric extensions of quantum states and 
local hidden variable theories. Phys. Rev. Lett., 90:157903, 2003. arXiv:quant-ph/0210053. 

[84] J. Watrous. Quantum computational complexity, 2008. arXiv:0804.3401. 

[85] S. Wehner. Entanglement in interactive proof systems with binary answers. In STACS'06, 
pages 162-171, 2006. arXiv:quant-ph/0508201. 

[86] R. F. Werner. An application of Bell's inequalities to a quantum state extension problem. Lett.
Math. Phys., 17:359, 1989.

[87] D. Yang. A simple proof of monogamy of entanglement. Physics Letters A, 360(2):249-250,
2006. arXiv:quant-ph/0604168. 

[88] D. Yang, K. Horodecki, M. Horodecki, P. Horodecki, J. Oppenheim, and W. Song. Squashed
entanglement for multipartite states and entanglement measures based on the mixed convex
roof. IEEE Trans. Inf. Th., 55(7):3375-3387, July 2009. arXiv:0704.2236.






